Abstract

Given $n$ equidistant realisations of a Lévy process $(L_t, t \ge 0)$, a natural estimator $\hat N_n$
for the distribution function $N$ of the Lévy measure is constructed. Under a polynomial
decay restriction on the characteristic function $\varphi$, a Donsker-type theorem is proved, that
is, a functional central limit theorem for the process $\sqrt{n}(\hat N_n - N)$ in the space of bounded
functions away from zero. The limit distribution is a generalised Brownian bridge process
with bounded and continuous sample paths whose covariance structure depends on the
Fourier-integral operator $\mathcal{F}^{-1}[1/\varphi(-\cdot)]$. The class of Lévy processes covered includes several
relevant examples such as compound Poisson, Gamma and self-decomposable processes.
Main ideas in the proof include establishing pseudo-locality of the Fourier-integral operator
and recent techniques from smoothed empirical processes.

MSC 2010 subject classification: Primary: 46N30; Secondary: 60F05. 

Key words and phrases: uniform central limit theorem, nonlinear inverse problem, smoothed 
1 Introduction

A classical result of probability theory is Donsker's central limit theorem for empirical
distribution functions: If $X_1, \dots, X_n$ are i.i.d. random variables with distribution function
$F(t) = P((-\infty,t])$, $t \in \mathbb{R}$, and if $F_n(t) = P_n((-\infty,t])$ where
$P_n = \frac{1}{n}\sum_{k=1}^n \delta_{X_k}$ is the empirical measure, then $\sqrt{n}(F_n - F)$ converges in law
in the Banach space of bounded functions on $\mathbb{R}$ to a $P$-Brownian bridge. The result in itself
and its many extensions have been at the heart of much of our understanding of modern statistics,
see the monographs Dudley (1999), van der Vaart and Wellner (1996) for a comprehensive account
of the foundations of this theory.
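
As a purely illustrative aside, the classical statement can be probed numerically. The sketch below computes the Kolmogorov-Smirnov statistic $\sqrt{n}\sup_t |F_n(t) - F(t)|$ for a sample and a continuous distribution function. The function name `ks_statistic` and the uniform example are assumptions made here for illustration; they are not part of the article.

```python
import math
import random


def ks_statistic(sample, cdf):
    """Return sqrt(n) * sup_t |F_n(t) - F(t)| for a continuous cdf F."""
    xs = sorted(sample)
    n = len(xs)
    sup = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # F_n jumps from i/n to (i+1)/n at x, so both sides of the jump are checked
        sup = max(sup, (i + 1) / n - f, f - i / n)
    return math.sqrt(n) * sup


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.random() for _ in range(1000)]
    # By Donsker's theorem this value behaves, for large n, like the supremum
    # of the absolute value of a Brownian bridge.
    print(ks_statistic(sample, lambda t: min(max(t, 0.0), 1.0)))
```

The supremum is evaluated only at the sample points, which suffices because $F_n$ is a step function and $F$ is monotone.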

The purpose of this article is to investigate a conceptually closely related problem: at
equidistant time steps $t_k = k\Delta$, $k = 0, 1, \dots, n$, one observes a trajectory of a Lévy process with
corresponding Lévy (or jump) measure $\nu$, and wishes to estimate the distribution function $N$
of $\nu$. Since we do not assume that the time distance $\Delta$ varies (in particular, no high-frequency
regime), we equivalently observe a sample from an infinitely divisible distribution given by the
i.i.d. increments of the process. Since $\nu$ is only a finite measure away from zero, the natural target
of estimation is $N(t) = \nu((-\infty,t])$ for $t < 0$ and $N(t) = \nu([t,\infty))$ for $t > 0$. By analogy to
the classical case of estimating $F$, one aims for an estimator $\hat N$ such that $\sqrt{n}(\hat N - N)$ satisfies
a limit theorem in the space of functions bounded on $\mathbb{R} \setminus (-\zeta, \zeta)$, $\zeta > 0$. Statistical minimax
theory reveals that the problem of estimating $N$ is intrinsically more difficult than the one of
estimating $F$: it is a nonlinear inverse problem in the terminology of nonparametric statistics.
We discuss this point in more detail below, but note that it implies that a rate of convergence
$1/\sqrt{n}$ for $\hat N(t) - N(t)$, even only at a single point $t$, cannot be achieved (by any estimator $\hat N$)
without certain qualitative assumptions on the Lévy process. In particular, the process cannot
contain a nonzero Gaussian component. On the other hand, and perhaps surprisingly, we show
in the present article that for a large and relevant class of Lévy processes a Donsker theorem can
be proved.

Similar to Donsker's classical theorem our results have interesting consequences for statistical 
inference, such as the construction of confidence bands and goodness of fit tests. While we do 
not address these issues explicitly here and concentrate on spelling out the mathematical ideas, 
it is nevertheless instructive to discuss some related literature on statistical inference on the 
Lévy triplet from discrete observations. The basic principle for understanding the nonlinearity
in this setting is already inherent in the problem of decompounding a compound Poisson process,
which has been studied in queuing theory and insurance mathematics. In this case the Lévy
measure $\nu$ is a finite measure and by explicit inversion in the convolution algebra Buchmann and
Grübel (2003) prove a central limit theorem with rate $1/\sqrt{n}$ for a plug-in estimator of $N$ in an
exponentially weighted supremum norm, assuming that the intensity of the process is known.

For general Lévy triplets the estimation problem is generally ill-posed in the sense of inverse
problems. In fact, the linearized problem is of deconvolution type where the part of the error
distribution is taken over by the observation law itself. This phenomenon, which could be coined
auto-deconvolution, was first studied by Belomestny and Reiß (2006). For the general problem
of estimating functionals of the Lévy measure the results by Neumann and Reiß (2009) show in
particular that a functional can be estimated at parametric rate $1/\sqrt{n}$ provided its smoothness
outweighs the ill-posedness induced by the decay of the characteristic function. Comparing to
Neumann and Reiß (2009) we are thus interested in the low regularity functional $f \mapsto \int_{-\infty}^t f$
(not covered by their results), and in exact limiting distributions. Instead of making inference on
the distribution function, one may also be interested in the associated nonparametric estimation
problem for a Lebesgue density of the Lévy measure, where the rate $1/\sqrt{n}$ can never be attained.
This problem was studied in Gugushvili (2009) for Lévy processes with finite jump activity and
a Gaussian part, Comte and Genon-Catalot (2010) for a model selection procedure in the finite
variation case, or Trabs (2011) for self-decomposable processes. Generalisations for observations
of more general jump processes like Lévy-Ornstein-Uhlenbeck processes or affine processes are
considered by Jongbloed, van der Meulen and van der Vaart (2005) and Belomestny (2011).

The proof of our main result contains certain subtleties that we wish to briefly discuss here:
In the classical Donsker case one proves that the empirical process $\sqrt{n}(P_n - P)$ is tight in the
space of bounded mappings acting on $\{1_{(-\infty,t]} : t \in \mathbb{R}\}$. The ill-posedness of the Lévy problem
can be roughly understood, after linearisation, as requiring to show that the empirical process
$\sqrt{n}(P_n - P)$ is tight in the space of bounded mappings acting on the class

\[ \mathcal{G}_\varphi = \big\{\mathcal{F}^{-1}[1/\varphi(-\cdot)] * 1_{(-\infty,t]} : |t| \ge \zeta\big\}, \tag{1.1} \]

where $\zeta > 0$ is arbitrary, $\mathcal{F}$ is the Fourier transform and where $\varphi = \mathcal{F}P$ is the characteristic
function of the increments of the Lévy process. In fact, the situation is more complicated than
that, but the above simplification highlights the main problem. Convolution with $\mathcal{F}^{-1}[1/\varphi]$ is
just a way of writing deconvolution with $P = \mathcal{F}^{-1}[\varphi]$, which is mathematically understood
as the action of a pseudo-differential operator, and the class $\mathcal{G}_\varphi$ can be shown not to be
$P$-Donsker (arguing as in Theorem 7 in Nickl (2006), for instance), except in very specific situations
(effectively in the compound Poisson case discussed above). In other words, the empirical process
is not tight when indexed by these functions.

A starting point of our analysis is that for certain Lévy processes a generalised $P$-Brownian
bridge $\mathbb{G}^\varphi$ with bounded sample paths can be defined on $\mathcal{G}_\varphi$, uniformly continuous for the intrinsic
covariance metric of $\mathbb{G}^\varphi$, see Theorem [HI. Roughly speaking this means that a tight limit process
exists, and that a limit theorem at rate $1/\sqrt{n}$ may hold if one replaces the empirical process
by a smoothed one. This hope is nourished by the phenomenon (first observed, in a general
empirical process setting unrelated to the present situation, by Radulović and Wegkamp (2000),
and recently developed further in several directions by Giné and Nickl (2008)) that smoothed
empirical processes may converge in situations where the unsmoothed process does not. The
results in Giné and Nickl (2008) apply to unbounded classes, so in particular to $\mathcal{G}_\varphi$, and this idea,
combined with a thorough analysis of the pseudo-differential operator $\mathcal{F}^{-1}[1/\varphi(-\cdot)]$, is at
the heart of our proofs.

The paper is organised as follows: Section 2 contains the exact conditions on the model, the
construction of the estimator and the main result. In Section 3 the model assumptions, some
important examples and potential extensions are discussed. Finally, the complete proof of the
Donsker-type result is given in Section 4, divided into the finite-dimensional central limit theorem
and the uniform tightness result.



2 The Setting and Main Result 

We observe a real-valued Lévy process $(L_t, t \ge 0)$ at equidistant time points $t_k = k\Delta$,
$k = 0, 1, \dots, n$, for $\Delta > 0$ fixed. It will be seen to be natural (Section 3) to restrict to Lévy processes
of (locally) finite variation. In this case the characteristic function of the increments
$X_k := L_{t_k} - L_{t_{k-1}}$ is given by

\[ \varphi(u) = E[\exp(iuX_k)] = e^{\Delta\psi(u)} \quad\text{where}\quad \psi(u) = i\gamma u + \int_{\mathbb{R}\setminus\{0\}} (e^{iux} - 1)\,\nu(dx) \]

with drift parameter $\gamma \in \mathbb{R}$ and Lévy (or jump) measure $\nu$ satisfying $\int (|x| \wedge 1)\,\nu(dx) < \infty$ (due
to finite variation). The increments $X_1, \dots, X_n$ are i.i.d. and we write $P$ for the law of $X_k$ and $p$
for its density (if it exists) as well as $P_n = \frac{1}{n}\sum_{k=1}^n \delta_{X_k}$ and $\varphi_n(u) = \mathcal{F}P_n(u) = \int e^{iux}\,dP_n(x)$ for
the empirical measure and empirical characteristic function, respectively. Throughout $\mathcal{F}$ denotes
the Fourier(-Plancherel) transform acting on finite measures, on the space $L^1(\mathbb{R})$ of integrable
or on the space $L^2(\mathbb{R})$ of square-integrable functions on $\mathbb{R}$, see e.g. Katznelson (1976) for the
standard Fourier techniques that we shall employ.

If $\nu$ has a finite first moment, then the weighted Lévy measure $x\nu(dx)$ can be identified
directly from the law of $X_k$ in the Fourier domain:

\[ \frac{1}{i\Delta}\frac{\varphi'(u)}{\varphi(u)} = -i\psi'(u) = \gamma + \int e^{iux}x\,\nu(dx) = \gamma + \mathcal{F}[x\nu](u). \tag{2.1} \]
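
As a sanity check of this identification, the following sketch compares both sides numerically for a Gamma process, for which $\gamma = 0$, $\varphi(u) = (1 - iu/\lambda)^{-\alpha\Delta}$ and $x\nu(dx) = \alpha e^{-\lambda x}1_{(0,\infty)}(x)\,dx$. The function names, the finite-difference step and the quadrature grid are illustrative assumptions, not part of the article.

```python
import cmath


def gamma_cf(u, alpha, lam, delta):
    """Characteristic function of a Gamma(alpha*delta, lam) increment."""
    return (1 - 1j * u / lam) ** (-alpha * delta)


def lhs_identity(u, alpha, lam, delta, eps=1e-5):
    """(1/(i*delta)) * phi'(u)/phi(u), with phi' from a central difference."""
    dphi = (gamma_cf(u + eps, alpha, lam, delta)
            - gamma_cf(u - eps, alpha, lam, delta)) / (2 * eps)
    return dphi / (1j * delta * gamma_cf(u, alpha, lam, delta))


def ft_weighted_levy_measure(u, alpha, lam, x_max=60.0, steps=60000):
    """Trapezoidal value of F[x nu](u) = int_0^inf e^{iux} alpha e^{-lam x} dx."""
    dx = x_max / steps
    vals = [alpha * cmath.exp((1j * u - lam) * k * dx) for k in range(steps + 1)]
    return dx * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


if __name__ == "__main__":
    u, alpha, lam, delta = 2.0, 0.3, 1.0, 1.5
    print(lhs_identity(u, alpha, lam, delta))
    print(ft_weighted_levy_measure(u, alpha, lam))
    print(alpha / (lam - 1j * u))  # closed form of both sides
```

Both numerical values should agree with the closed form $\alpha/(\lambda - iu)$, which does not depend on $\Delta$, as the identity predicts.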

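Our goal is to estimate the cumulative distribution function of $\nu$

\[ N(t) := \begin{cases} \nu((-\infty,t]), & t < 0, \\ \nu([t,\infty)), & t > 0, \end{cases} \tag{2.2} \]

from the sample $X_1, \dots, X_n$. Note that in general $N(t)$ tends to infinity for $t \to 0$. If we denote by
$\mathcal{F}^{-1}$ the inverse Fourier transform, then the relation (2.1) suggests a natural empirical estimate
of $N(t)$ (we shall see below that $\gamma$ can be neglected),

\[ \hat N_n(t) := \int_{\mathbb{R}} g_t(x)\,\mathcal{F}^{-1}\Big[\frac{1}{i\Delta}\frac{\varphi_n'}{\varphi_n}\mathcal{F}K_h\Big](x)\,dx \quad\text{with}\quad g_t(x) := \begin{cases} x^{-1}1_{(-\infty,t]}(x), & t < 0, \\ x^{-1}1_{[t,\infty)}(x), & t > 0, \end{cases} \tag{2.3} \]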



where $K$ is a band-limited kernel function and $K_h(x) := h^{-1}K(x/h)$. In the sequel the kernel
will be required to satisfy

\[ \int K = 1, \quad \operatorname{supp}(\mathcal{F}K) \subseteq [-1,1] \quad\text{and}\quad |K(x)| + |K'(x)| \lesssim (1+|x|)^{-\beta} \text{ for some } \beta > 2. \tag{2.4} \]

Throughout, we shall write $A_p \lesssim B_p$ if $A_p \le C B_p$ holds with a uniform constant $C$ in the
parameter $p$ as well as $A_p \sim B_p$ if $A_p \lesssim B_p$ and $B_p \lesssim A_p$.

The smooth spectral cutoff induced by multiplication with $\mathcal{F}K_h$ is desirable for various
reasons; in particular, it will imply that $\hat N_n$ is well-defined with probability tending to one. By
Plancherel's formula, we have the alternative representation

\[ \hat N_n(t) = \frac{1}{2\pi i\Delta}\int \mathcal{F}g_t(-u)\,\frac{\varphi_n'(u)}{\varphi_n(u)}\,\mathcal{F}K_h(u)\,du. \]

Heuristically, for $h_n \to 0$ we expect consistency $\hat N_n(t) \to N(t)$ in probability, $t \ne 0$, because
as $h_n \to 0$ we have $K_{h_n} \to \delta_0$ (the Dirac measure in zero) and thus $\mathcal{F}K_{h_n}(u) \to 1$, which may
be combined with the law of large numbers for both $\varphi_n$ and $\varphi_n'$. For this argument to work it
is important to note that the drift $\gamma$ induces a point measure in zero for $\mathcal{F}^{-1}[\varphi'/\varphi]$ which is
outside the support of $g_t$, compare Section 4.1.1 below. For our precise results we shall need the
following conditions on the data-generating Lévy process. Throughout the paper we often write
$\varphi^{-1}$ for $1/\varphi$.
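
To make the construction concrete, here is a minimal numerical sketch of the plug-in estimator for $t > 0$ in its Plancherel form, run on a toy compound Poisson model with intensity 1 and Exp(1) jumps, for which $N(t) = e^{-t}$. Everything beyond the estimator formula itself is an assumption made for illustration: the function names, the cutoff $\mathcal{F}K(u) = (1-u^2)^2 1_{[-1,1]}(u)$, the fixed bandwidth $h = 0.1$ (not the rate required by the theory), and the quadrature grids and truncation point `x_max`.

```python
import cmath
import math
import random


def simulate_increments(n, delta=1.0, intensity=1.0, seed=0):
    """Increments of a compound Poisson process with Exp(1) jumps (toy model)."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n):
        total, clock = 0.0, rng.expovariate(intensity)
        while clock <= delta:  # one jump per Poisson arrival in [0, delta]
            total += rng.expovariate(1.0)
            clock += rng.expovariate(intensity)
        xs.append(total)
    return xs


def kernel_ft(u):
    """Assumed spectral cutoff FK(u) = (1 - u^2)^2 on [-1, 1], zero elsewhere."""
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def levy_cdf_estimate(xs, t, delta=1.0, h=0.1, du=0.05, dx=0.02, x_max=20.0):
    """Plug-in estimate of N(t) = nu([t, inf)) for t > 0 (Plancherel form)."""
    n = len(xs)
    n_u = int(round(1.0 / (h * du)))
    n_x = int(round((x_max - t) / dx))
    grid_x = [t + j * dx for j in range(n_x + 1)]
    total = 0.0
    for k in range(n_u + 1):
        u = k * du
        w = kernel_ft(h * u)
        if w == 0.0:
            continue
        e = [cmath.exp(1j * u * x) for x in xs]
        phi = sum(e) / n                                     # empirical cf
        dphi = sum(1j * x * ex for x, ex in zip(xs, e)) / n  # its derivative
        spectral = dphi / (1j * delta * phi) * w
        # int_t^{x_max} exp(-iux)/x dx by the trapezoidal rule
        vals = [cmath.exp(-1j * u * x) / x for x in grid_x]
        inner = dx * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
        weight = 0.5 if k == 0 else 1.0
        # the integrand at -u is the complex conjugate, so u >= 0 suffices
        total += weight * (spectral * inner).real * du
    return total / math.pi


if __name__ == "__main__":
    sample = simulate_increments(2000, seed=1)
    print(levy_cdf_estimate(sample, t=1.0), math.exp(-1.0))
```

The sketch uses pure Python for transparency; a vectorised implementation would be preferable for larger samples or finer grids.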

1 Assumption. We require for some $\varepsilon > 0$:

(a) $\int \max(|x|, |x|^{2+\varepsilon})\,\nu(dx) < \infty$;

(b) $x\nu$ has a bounded Lebesgue density and $|\mathcal{F}[x\nu](u)| \lesssim (1+|u|)^{-1}$;

(c) $(1+|u|)^{-1+\varepsilon}\varphi^{-1}(u) \in L^2(\mathbb{R})$.

Assumption 1(a) imposes finite variation, ensuring the identification identity (2.1), as well as
finite $(2+\varepsilon)$-moments of $\nu$ and $P$, since by Thm. 25.3 in Sato (1999)

\[ \int |x|^{2+\varepsilon}\,\nu(dx) < \infty \iff \int |x|^{2+\varepsilon}\,P(dx) < \infty. \tag{2.5} \]

As $\hat N$ is based on $\varphi_n'(u)$, and since a central limit theorem is desired, it is natural to require a
finite second moment of $X_k$. The additional $\varepsilon$ in the power will allow us to apply the Lyapounov
criterion in the CLT for triangular schemes and to obtain uniform in $u$ stochastic bounds for
$\varphi_n'(u) - \varphi'(u)$ over increasing intervals. Assumptions 1(b,c) are discussed in more detail after the
following theorem, which is the main result of this article.

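For $\zeta > 0$, let $\ell^\infty((-\infty,-\zeta] \cup [\zeta,\infty))$ be the space of bounded real-valued functions on
$(-\infty,-\zeta] \cup [\zeta,\infty)$ equipped with the supremum norm. Convergence in law in this space, denoted
by $\to^{\mathcal{L}}$, is defined as in Dudley (1999), p. 94.

2 Theorem. Suppose that Assumption 1 is satisfied, $\zeta > 0$ and $h_n \sim n^{-1/2}(\log n)^{-\rho}$ for some
$\rho > 1$. Then as $n \to \infty$

\[ \sqrt{n}(\hat N_n - N) \to^{\mathcal{L}} \mathbb{G}^\varphi \quad\text{in } \ell^\infty((-\infty,-\zeta] \cup [\zeta,\infty)), \]

where $\mathbb{G}^\varphi$ is a centered Gaussian Borel random variable in $\ell^\infty((-\infty,-\zeta] \cup [\zeta,\infty))$ with covariance
structure given by

\[ \Sigma_{t,s} = \frac{1}{\Delta^2}\int_{\mathbb{R}} \Big(\mathcal{F}^{-1}\Big[\frac{1}{\varphi(-\cdot)}\Big] * (x g_t(x))\Big) \times \Big(\mathcal{F}^{-1}\Big[\frac{1}{\varphi(-\cdot)}\Big] * (x g_s(x))\Big)\,P(dx) \]

and where $g_t$ is given in (2.3).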



In view of $x g_t(x) = 1_{(-\infty,t]}(x)$ for $t < 0$ and symmetrically for $t > 0$, the representation
of the covariance in the theorem above is intuitively appealing when compared to the classical
Donsker theorem. Its rigorous interpretation, however, needs some care, as it is not quite clear
how the pseudo-differential operator $\mathcal{F}^{-1}[1/\varphi(-\cdot)]$ acts on the indicator function $x g_t(x)$. One
rigorous representation that follows from our proofs uses

\[ \mathcal{F}^{-1}[\varphi^{-1}(-\cdot)] * 1_{(-\infty,t]} = \mathcal{F}^{-1}[(1+iu)^{-1}\varphi^{-1}(-u)] * (1_{(-\infty,t]} + \delta_t) \]

together with the fact that $\mathcal{F}^{-1}[(1+iu)^{-1}\varphi^{-1}(-u)]$ can be shown to be contained in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$
under Assumption 1 (using lifting properties of Besov spaces), so that the right-hand side of the
last display is defined almost everywhere.

Another more explicit representation, which also implies that $\Sigma_{t,t} < \infty$, is the following: Note
that formally

\[ \int_{\mathbb{R}} \mathcal{F}^{-1}[\varphi^{-1}(-\cdot)] * (x g_t(x))\,dP(x) = \frac{1}{2\pi}\int_{\mathbb{R}} \big(\mathcal{F}[x g_t](-\cdot)\big)(u)\,\varphi^{-1}(u)\varphi(u)\,du = (x g_t)(0) = 0, \]

which explains why the covariance in Theorem 2 is centered for $t \ne 0$. Moreover, $\mathcal{F}[x g_t] =
i^{-1}(\mathcal{F}[g_t])'$ and integration by parts gives rise to the formally equivalent representation

\[ \Sigma_{t,s} = (i\Delta)^{-2}\int_{\mathbb{R}} h_t(x)h_s(x)\,P(dx) \tag{2.6} \]

where

\[ h_t(x) := \mathcal{F}^{-1}[\varphi^{-1}(-u)\mathcal{F}g_t(u)](x)\,ix + \mathcal{F}^{-1}[(\varphi^{-1})'(-u)\mathcal{F}g_t(u)](x), \]

and where we note that $i^{-1}h_t$ is real-valued. This expression for $h_t$ is the one we shall
employ in our proofs, as it can be shown to be rigorously defined in $L^2(P)$ under the maintained
assumptions, see (4.10) below for more details.

Moreover, the last representation immediately suggests consistent estimators of $\Sigma_{t,s}$ based
on the empirical characteristic function $\varphi_n$ and the empirical measure $P_n$, useful when one is
interested in the Gaussian limiting distribution for inference purposes on $N$.



3 Discussion 

3.1 The regularity conditions 

We remark first that the results in Neumann and Reiß (2009) imply that we can attain a
$1/\sqrt{n}$-rate for estimation only if the characteristic function decays at most with a low polynomial order.
This restricts the classes of Lévy processes automatically to the (locally) finite variation case (e.g.
proof of Prop. 28.3 in Sato (1999)), and moreover excludes all Lévy processes with a nonzero
Gaussian component.

Let us next discuss Assumption 1(c) which describes the lower bound we need on the
ill-posedness of the estimation problem. It holds for all compound Poisson processes, in which
case $|\varphi^{-1}(u)|$ is bounded, but also for Gamma processes with $\alpha \in (0, 1/(2\Delta))$ and for pure-jump
self-decomposable processes with not too high jump activity at zero, see Proposition 3
below. Recall (e.g. Sato (1999), Section 15) that self-decomposable distributions describe the
limit laws of suitably rescaled sums of independent random variables as well as the stationary
distributions of Lévy-Ornstein-Uhlenbeck processes, and thus give rise to a rich nonparametric
class of Lévy measures. More generally, if $E[e^{iuL_1}]$ decays polynomially, then there exists a $\Delta_0 > 0$
such that for all $\Delta < \Delta_0$ the corresponding characteristic function $\varphi(u) = E[e^{iuL_\Delta}]$ satisfies
$|\varphi(u)| \gtrsim (1+|u|)^{-\alpha}$ for $\alpha < 1/2$, so Assumption 1(c) holds for any polynomially decaying $\varphi$
if the sampling frequency is large (i.e., $\Delta$ small) enough. Abstractly, Assumption 1(c) means
that the pseudo-differential operator $\mathcal{F}^{-1}[\varphi^{-1}]$ of deconvolution is an element of the $L^2$-Sobolev
space $H^{-1+\varepsilon}(\mathbb{R})$ of negative order $\varepsilon - 1$. In the simpler problem of statistical deconvolution an
analogous restriction for the characteristic function of the error variables is necessary, even if one
is only interested in rates of convergence of an estimator, and the situation is similar here: The
lower bound techniques from Theorem 4.4 of Neumann and Reiß (2009) or Theorem 1 of Lounici
and Nickl (2011) can be adapted to the present situation to imply, for instance, that for Gamma
processes with $\alpha > 1/(2\Delta)$ the 'parametric' rate $1/\sqrt{n}$ cannot be achieved by any estimator in
the Lévy estimation problem considered here, so that Assumption 1(c) is in this sense sharp for
Theorem 2.

The smoothness condition on $x\nu$ in Assumption 1(b) is not very restrictive: it is satisfied
whenever the weighted Lévy measure $x\nu$ has a density whose weak derivative is a finite measure
(noting $x\nu \in L^1(\mathbb{R})$ by Assumption 1(a)). As simple examples, any compound Poisson process
with a jump density of bounded variation and a finite first moment satisfies this condition, as does
any Gamma process. More generally, most self-decomposable processes satisfy this condition, see
Proposition 3 below.

The key role of Assumption 1(b) is not to enforce smoothness of $\nu$, but to ensure
pseudo-locality of the deconvolution operator $\mathcal{F}^{-1}[\varphi^{-1}]$ in the sense that the location of singularities like
the jump in the indicator $1_{(-\infty,t]}$ remains unchanged under deconvolution. A similar situation
arises in standard deconvolution problems, see the recent paper Schmidt-Hieber, Munk and
Dümbgen (2012). In the spirit of the theory of pseudo-differential operators this is established
by differentiating in the spectral domain, see (4.9) below for details,
under the condition that $(\varphi^{-1})' = -\Delta\psi'\varphi^{-1} \in L^2(\mathbb{R})$. Neglecting the drift, $\psi'$ is $i\mathcal{F}[x\nu]$ and
Assumptions 1(b), 1(c) together ensure $(\varphi^{-1})' \in L^2(\mathbb{R})$, see Lemma 4 below. As discussed later,
the example of a superposition of a Gamma and Poisson process provides a simple concrete
situation where a violation of this condition renders the asymptotic variance in Theorem 2
infinite.

There is another interesting interaction between Assumptions 1(b) and 1(c). A decay rate
$|u|^{-1}$ for $\mathcal{F}[x\nu](u)$ is the maximal possible smoothness requirement under 1(c); otherwise
$|\mathrm{Re}(\psi'(u))| \le |\mathcal{F}[x\nu](u)| = o(|u|^{-1})$ would imply $|\varphi(u)| = \exp(\mathrm{Re}(\Delta\psi(u))) = \exp(o(\log(u)))$
for $|u| \to \infty$, excluding polynomial decay of the characteristic function $\varphi$.
3.2 Examples

We now discuss a few examples in more detail. 

Compound Poisson Processes. The compound Poisson case where $\nu$ is a finite measure is
covered in Theorem 2. Note that due to the presence of a point mass at zero in $P$ the
characteristic function satisfies $\inf_u |\varphi(u)| \ge \exp(-2\nu(\mathbb{R})) > 0$ ($\Delta = 1$). Therefore Assumption
1(c) is trivially satisfied. Assumption 1(b) requires that the law of the jump sizes has a
density $\nu$ such that $x\nu(x)$ is bounded and has the respective decay property in the Fourier
domain. Assumption 1(a) just postulates $(2+\varepsilon)$ finite moments of the jump law. Compared
to Buchmann and Grübel (2003) we thus obtain directly a uniform central limit theorem without
weighting, exponential moments and, perhaps more importantly, without prior knowledge
of the intensity, yet our result holds only away from the origin and under Assumption 1(b).

Stronger results can be obtained by adapting our method to this specific case because the
distribution function $N$ of $\nu$ is defined classically for all $t \in \mathbb{R}$ and Assumption 1(b) is
not required to ensure pseudo-locality of deconvolution. In fact, deconvolution reduces to
convolution with a signed measure because of ($\nu^{*k}$ denotes $k$-fold convolution)

\[ \mathcal{F}^{-1}[\varphi^{-1}] = e^{\Delta\nu(\mathbb{R})}\sum_{k=0}^{\infty} \frac{(-\Delta)^k}{k!}\,\nu^{*k}. \]

Therefore, $\mathcal{F}^{-1}[\varphi^{-1}(-\cdot)] * 1_{(-\infty,t]}$ is a bounded function, in fact of bounded variation, and
the uniform CLT for the linearized stochastic term follows directly (since BV-balls are
universal Donsker classes). The remainder term remains negligible whenever the inverse
bandwidth grows slower than exponentially in $n$. Choosing for instance $h_n \sim \exp(-\sqrt{n})$
yields a pointwise CLT for $\sqrt{n}(\hat N_n(t) - N(t))$ for all $t \in \mathbb{R}$ if the bias is negligible, e.g. if
$N$ has some positive Hölder regularity at $t$. We do not pursue a detailed derivation of this
specific case here.
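
The reduction of deconvolution to convolution with a signed measure can be checked in the Fourier domain. For zero drift and $\Delta = 1$ one has $\varphi = \exp(\mathcal{F}\nu - \nu(\mathbb{R}))$, so $1/\varphi = e^{\nu(\mathbb{R})}\sum_k (-\mathcal{F}\nu)^k/k!$, which is the Fourier transform of a signed measure of total variation at most $e^{2\nu(\mathbb{R})}$. The sketch below verifies the series numerically for the assumed toy case $\nu(dx) = \lambda e^{-x}1_{(0,\infty)}(x)\,dx$; the function names and parameter values are illustrative assumptions.

```python
import cmath
import math


def inverse_cf_series(u, intensity=1.0, terms=30):
    """Partial sum of e^{nu(R)} * sum_k (-F nu(u))^k / k!  (Exp(1) jumps, Delta = 1)."""
    ft_nu = intensity / (1 - 1j * u)  # F nu(u) for nu(dx) = intensity * e^{-x} dx
    partial = sum((-ft_nu) ** k / math.factorial(k) for k in range(terms))
    return math.exp(intensity) * partial


def inverse_cf_exact(u, intensity=1.0):
    """1/phi(u) with phi(u) = exp(F nu(u) - nu(R)), zero drift, Delta = 1."""
    ft_nu = intensity / (1 - 1j * u)
    return cmath.exp(intensity - ft_nu)


if __name__ == "__main__":
    for u in (0.0, 1.5, -7.0):
        print(u, abs(inverse_cf_series(u) - inverse_cf_exact(u)))
```

Each term $(-\mathcal{F}\nu)^k/k!$ corresponds to the $k$-fold convolution $\nu^{*k}$ with alternating sign, which is why no differentiation, and hence no loss of regularity, occurs in this case.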

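Gamma Processes. The family of Gamma processes satisfies $X_k \sim \Gamma(\alpha\Delta, \lambda)$, with
probability density $\gamma(y; \alpha\Delta, \lambda) = (1/\Gamma(\alpha\Delta))\lambda^{\alpha\Delta}y^{\alpha\Delta-1}e^{-\lambda y}$, Lévy measure
$\nu(dx) = \alpha x^{-1}e^{-\lambda x}1_{\mathbb{R}_+}(x)\,dx$ and characteristic function $\varphi(u) = (1 - iu/\lambda)^{-\alpha\Delta}$. For simplicity we
consider $\lambda = 1$ and, in order to satisfy Assumption 1(c), we restrict to $\alpha \in (0, 1/(2\Delta))$. We
denote the density of $\Gamma(\beta, 1)$ by $\gamma_\beta$ and its distribution function by $\Gamma_\beta$. Then

\[ \mathcal{F}^{-1}[\varphi^{-1}] = \mathcal{F}^{-1}[(1 - iu)^{\alpha\Delta-1}(1 - iu)] = \gamma_{1-\alpha\Delta} * (\mathrm{Id} + D) \]

holds with the differential operator $D$. This is a well known form of the fractional derivative
operator of order $\alpha\Delta$. We deduce

\[ \mathcal{F}^{-1}[\varphi^{-1}(-\cdot)] * 1_{[t,\infty)} = \gamma_{1-\alpha\Delta}(-\cdot) * (1_{[t,\infty)} - \delta_t). \]

Hence, for $t > 0$ the asymptotic variance of Theorem 2 is given by

\[ \Sigma_{t,t} = \int_0^\infty \big(1 - \Gamma_{1-\alpha\Delta}(t - x) - \gamma_{1-\alpha\Delta}(t - x)\big)^2\,\gamma_{\alpha\Delta}(x)\,dx. \]

Note that the integrand has poles of order $2\alpha\Delta$ at $x = t$ and of order $1 - \alpha\Delta$ at $x = 0$ such
that the variance is finite if and only if $\alpha\Delta < 1/2$ and $t \ne 0$. So, in this case, Assumption
1(c) prevents $\Sigma_{t,t}$ from being infinite.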

Moreover, the Gamma process case can serve as a basic example for all the theory that 
follows. It reveals the problem that standard $L^2$-theory or non-local Fourier analysis will
not be sufficient in this context as different locations of the singular support (the poles)
are required to ensure finiteness of $\Sigma_{t,t}$.

Gamma plus Poisson process. Let us briefly give a simple counterexample showing that
pseudo-locality of the deconvolution operator is important. If the Lévy process is a superposition
of a Gamma process as above with $\alpha \in (0, 1/(2\Delta))$ and of an independent Poisson process
with intensity $\lambda > 0$, the density $p$ of the increments is given by the convolution of the
$\Gamma_{\alpha\Delta}$-density with a Poiss($\lambda$)-law and thus has poles of order $1 - \alpha\Delta$ at $x \in \mathbb{N}_0$. On the
other hand, the deconvolution operator is given by

\[ \mathcal{F}^{-1}[\varphi^{-1}(-\cdot)] = e^{\lambda}\sum_{k=0}^{\infty} \frac{(-\lambda)^k}{k!}\,\delta_{-k} * \gamma_{1-\alpha\Delta}(-\cdot) * (\mathrm{Id} - D). \]

As in the pure Gamma case, this shows that $\Sigma_{t,t}$ is finite if and only if none of the poles
at $x = t - k$, $k \in \mathbb{N}_0$, and at $x = k$, $k \in \mathbb{N}_0$, of the respective functions coincide, which
is the case only for non-integer $t \notin \mathbb{N}_0$. Consequently, we cannot hope even to prove a
pointwise CLT with rate $1/\sqrt{n}$ at integers $t$. This case, in which singularities are just translated
by convolution with point measures, is excluded by the regularity requirement for $x\nu$ in
Assumption 1(b).

Self-Decomposable Processes. We finally consider the class of self-decomposable processes,
cf. Sato (1999), Section 15, which contains all Gamma processes. For any pure-jump
self-decomposable process we have $\nu(dx) = k(x)/|x|\,dx$ with a unimodal $k$-function increasing
on $(-\infty,0)$ and decreasing on $(0,\infty)$. If the limits $k(0-)$ and $k(0+)$ of $k$ at zero
are finite, then $k$ is a function of bounded variation and so is $\mathrm{sgn}(x)k(x)$, the density of
$x\nu$. The moment condition of Assumption 1(a) in particular implies $\mathrm{sgn}(x)k(x) \in L^1(\mathbb{R})$
which yields Assumption 1(b). It is quite remarkable that the probabilistic property of
self-decomposability implies the analytic property of pseudo-locality for the deconvolution
operator.

For the characteristic function of self-decomposable processes we have $|\varphi(u)| \gtrsim (1+|u|)^{-\Delta\alpha}$
with $\alpha = k(0-) + k(0+)$, which follows exactly as the proof of Lemma 2.1 in Trabs (2011).
The latter is the counterpart to Lemma 53.9 in Sato (1999), where an upper bound of
the same order times a logarithmic factor is shown. We conclude that Assumption 1(c)
translates to the condition $\alpha < 1/(2\Delta)$.

We note that Assumptions 1(a) and 1(b) remain true under superposition of independent Lévy
processes and we collect the findings in an explicit statement.

3 Proposition. Assumption 1 is satisfied for

(a) a compound Poisson process whenever the jump law has a density $\nu$ such that $x\nu$ is of
bounded variation and $\nu$ has a finite $(2+\varepsilon)$-moment,

(b) a Gamma process with parameters $\alpha \in (0, 1/(2\Delta))$ and $\lambda > 0$,

(c) a pure-jump self-decomposable process whenever its $k$-function satisfies
$\int \max(1, |x|^{1+\varepsilon})k(x)\,dx < \infty$ and $k(0-) + k(0+) < 1/(2\Delta)$,

(d) and for any Lévy process which is a sum of independent compound Poisson and
self-decomposable processes of the preceding types.
3.3 Extensions and perspectives

There are many directions for further investigation. As from the classical Donsker result, concrete
statistical inference procedures, like Lévy analogues of the classical Kolmogorov-Smirnov tests
and corresponding confidence bands, can be derived from Theorem 2. Also extensions to uniform
CLTs for more general functionals than just for the distribution function are highly relevant. A
question of particular interest in the area of statistics for stochastic processes is whether one can
allow for high-frequency observation regimes $\Delta_n \to 0$. As discussed above, decreasing $\Delta \to 0$
renders the inverse problem more regular, as Assumption 1(c) is then easier to satisfy. Since
we use the central limit theorem for triangular arrays in our proofs, allowing $\Delta$ to depend on
$n$ should not pose a principal difficulty, but doing so in a sharp way may not only require an
estimator based on the second derivative of $\log(\varphi_n)$, but also extra care in controlling all terms
uniformly in $n$, and is beyond the scope of the present paper.

Another issue of statistical relevance is the question of efficiency, which we briefly address
here. Our plug-in estimation method is quite natural and should have asymptotic optimality
properties as the empirical distribution function has for the classical i.i.d. case. This is also in
line with the result by Klaassen and Veerman (2011) who show that the tangent space of the class
of infinitely divisible distributions with positive Gaussian part is nonparametric to the effect that
the estimation of linear functionals $\int g\,dP$ of $P$ (but not $\nu$ as in our case) by empirical means is
asymptotically efficient. Indeed, a formal derivation indicates that the pointwise asymptotic
variance of our estimator $\hat N_n(t)$ coincides with the semiparametric Cramér-Rao information bound
(see van der Vaart and Wellner (1996), Chapter 3.11, for the relevant definitions). Let us restrict
here to the case $t < 0$ and assume that the observation law $P_\nu$ has a Lebesgue density $p_\nu$.

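Perturbing the Lévy measure $\nu$ in direction of an $L^1$-function $h$, we obtain by differentiating
in the Fourier domain the score function (the derivative of the log-likelihood)

\[ \frac{d}{d\varepsilon}\frac{p_{\nu+\varepsilon h}}{p_\nu}\Big|_{\varepsilon=0} = \frac{\mathcal{F}^{-1}\big[\Delta\varphi_\nu(u)\int (e^{iux} - 1)\,h(dx)\big]}{p_\nu} = \frac{\Delta(h * p_\nu - \lambda_h p_\nu)}{p_\nu} \]

with $\lambda_h = \int h$. This yields the Fisher information at measure $\nu$ in direction $h$ as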

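On the other hand, we aim at estimation of the functional $\nu \mapsto N(t)$ whose derivative in direction
$h$ by linearity is given by $H(t) = \langle 1_{(-\infty,t]}, h\rangle$ (interpreting $\langle\cdot,\cdot\rangle$ as a dual pairing). The
semi-parametric Cramér-Rao lower bound is then $\sup_h \langle 1_{(-\infty,t]}, h\rangle^2 / \langle I(\nu)h, h\rangle$, maximising the parametric bound
over all sub-models $(\nu + \varepsilon h)_{\varepsilon}$. The supremum is formally attained at $h^* = I(\nu)^{-1}1_{(-\infty,t]}$ with
value $\langle 1_{(-\infty,t]}, h^*\rangle$. The maximiser can be expressed explicitly using the deconvolution operator:

\[ h^* = \mathcal{F}^{-1}[\varphi^{-1}] * \Big\{p_\nu \times \Big(\mathcal{F}^{-1}[\varphi^{-1}(-u)] * 1_{(-\infty,t]} - \mathcal{F}^{-1}[\varphi^{-1}(-u)] * 1_{(-\infty,t]}(0)\Big)\Big\}. \]

Resuming the formal calculus and noting that $\mathcal{F}^{-1}[\varphi^{-1}(-u)]$ is the formal adjoint of $\mathcal{F}^{-1}[\varphi^{-1}]$,
we find the explicit Cramér-Rao bound

\[ \int 1_{(-\infty,t]}(x)h^*(x)\,dx = \int \big(\mathcal{F}^{-1}[\varphi^{-1}(-u)] * 1_{(-\infty,t]}\big)(x)\,p_\nu(x)\,\big(\mathcal{F}^{-1}[\varphi^{-1}(-u)] * 1_{(-\infty,t]}\big)(x)\,dx, \]

which is exactly equal to the asymptotic variance $\Sigma_{t,t}$ from Theorem 2. We have used here that
$\mathcal{F}^{-1}[\varphi^{-1}(-u)] * 1_{(-\infty,t]}(X)$ is centred, cf. (IT^ below.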
The hardest parametric subproblem of our general semi-parametric estimation problem is thus
given by perturbing $\nu$ in direction of $h^*$. The lower bound for the variance equals exactly the
asymptotic variance of our estimator. Let us nevertheless emphasize that this formal derivation
of the Cramér-Rao lower bound does not justify asymptotic efficiency in a completely rigorous
manner: for this one would have to establish the regularity of the statistical model and $h^* \in
L^1(\mathbb{R})$, which appears to require an even finer analysis of the main terms than our Donsker-type
result. The complete proof remains a challenging open problem.

4 Proof of Theorem 2

The remainder of this article is devoted to the proof of Theorem 2, which is split into the separate
proofs of convergence of the finite-dimensional distributions and of tightness. We shall repeatedly
use the following auxiliary lemma.

4 Lemma. Suppose $\gamma = 0$. Then Assumption 1 implies:

(a) The measure $xP = xP(dx)$ has a bounded Lebesgue density on $\mathbb{R}$.

(b) $(\varphi^{-1})' \in L^2(\mathbb{R}) \cap L^\infty(\mathbb{R})$ as well as $|\varphi^{-1}(u)| \lesssim (1+|u|)^{(1-\varepsilon)/2}$ for all $u \in \mathbb{R}$;

(c) $m(u) := \varphi^{-1}(-u)(1+iu)^{-(1-\varepsilon)/2}$ is a Fourier multiplier on every Besov space $B^s_{p,q}(\mathbb{R})$ with
$s \in \mathbb{R}$, $p, q \in [1,\infty]$; that is, convolution with $\mathcal{F}^{-1}m$ is continuous from $B^s_{p,q}(\mathbb{R})$ to $B^s_{p,q}(\mathbb{R})$.

Proof.

(a) From (2.1) with $\gamma = 0$ we see

\[ \mathcal{F}[ixP](u) = \varphi'(u) = i\Delta\,\mathcal{F}[x\nu](u)\,\mathcal{F}P(u) \;\Longrightarrow\; xP = \Delta(x\nu) * P \]

and thus with $x\nu$ (Assumption 1(b)) also $xP$ has a Lebesgue density $xp(x)$ with
$\|xp\|_\infty \le \Delta\|x\nu\|_\infty$.

(b) From Assumption 1(b) and $\gamma = 0$ we deduce $|\psi'(u)| \lesssim (1+|u|)^{-1}$ and thus
$\|(1+|u|)^{\varepsilon}(\varphi^{-1})'\|_{L^2} \lesssim \|(1+|u|)^{-1+\varepsilon}\varphi^{-1}\|_{L^2} < \infty$ by Assumption 1(c). This implies

\[ |\varphi^{-1}(u)| \le 1 + \int_0^{|u|} |(\varphi^{-1})'(v)|\,dv \le 1 + \|(1+|v|)^{\varepsilon}(\varphi^{-1})'\|_{L^2}\,\|(1+|v|)^{-\varepsilon}1_{[0,|u|]}\|_{L^2} \lesssim (1+|u|)^{1/2-\varepsilon} \lesssim (1+|u|)^{(1-\varepsilon)/2}, \]

and then also $|(\varphi^{-1})'(u)| \lesssim |\varphi^{-1}(u)||\psi'(u)| \lesssim 1$, so $(\varphi^{-1})' \in L^\infty(\mathbb{R})$.

(c) The Fourier multiplier property of $m$ follows from the Mihlin multiplier theorem for Besov
spaces (see e.g. Triebel (2010) and particularly the scalar version of Cor. 4.11(b) in Girardi
and Weis (2003)): because of (b) the function $m$ is bounded and satisfies

\[ |um'(u)| \lesssim |um(u)|(1+|u|)^{-1} \lesssim 1. \]

Consequently, the conditions of Mihlin's multiplier theorem are fulfilled and $m$ is a Fourier
multiplier on all Besov spaces $B^s_{p,q}(\mathbb{R})$.

□
4.1 Convergence of the Finite-Dimensional Distributions

Denote by $H^s(\mathbb{R})$, $s \in \mathbb{R}$, the standard $L^2$-Sobolev spaces with norm $\|h\|_{H^s} := \|\mathcal{F}h(u)(1+|u|)^s\|_{L^2}$.

5 Definition. We say that a function $g \in L^\infty(\mathbb{R}) \cap L^2(\mathbb{R})$ is admissible if

(a) $g$ is Lipschitz continuous in a neighbourhood of zero,

(b) we can split $g = g^c + g^s$ into functions $g^c \in H^1(\mathbb{R})$, $g^s \in L^1(\mathbb{R})$, satisfying
$\max(|\mathcal{F}[g^s](u)|, |\mathcal{F}[xg^s](u)|) \lesssim (1+|u|)^{-1}$ for all $u \in \mathbb{R}$.

6 Lemma. The functions $g_t$ from (2.3) as well as all finite linear combinations $\sum_i \alpha_i g_{t_i}$ with
$\alpha_i \in \mathbb{R}$, $t_i \ne 0$, are admissible. Moreover, we can choose $g_t^c$, $g_t^s$ in such a way that

\[ \|g_t^c\|_{H^1} \lesssim (1+|t|)^{-1/2}, \quad |\mathcal{F}g_t^s(u)| \lesssim (1+|u|)^{-1}(1+|t|)^{-1} \quad\text{and}\quad |\mathcal{F}[xg_t^s](u)| \lesssim (1+|u|)^{-1}, \]

the inequalities holding with constants independent of $u \in \mathbb{R}$, $t \in \mathbb{R}\setminus(-\zeta,\zeta)$ for $\zeta > 0$ fixed.

Proof. First note that all properties of admissible functions remain invariant under finite linear
combinations and reflection $g \mapsto g(-\cdot)$. It thus suffices to check that $g_t$, $t < 0$, is admissible. Let
$\chi \in C^\infty((-\infty,0])$ be a smooth function with $\chi(0) = 1$ and $\chi, \chi'$ both bounded and integrable
on $(-\infty,0]$, for instance $\chi(x) = e^x 1_{(-\infty,0]}(x)$. Decompose $g_t = g_t^c + g_t^s$ with

\[ g_t^c(x) = g_t(x)(1 - \chi(x-t)), \quad g_t^s(x) = g_t(x)\chi(x-t), \quad\text{for } x \le t, \]

and both equal to zero for $x > t$. Then $g_t^c \in L^2(\mathbb{R})$ and its (weak) derivative is

\[ (g_t^c)'(x) = -x^{-2}(1 - \chi(x-t))1_{(-\infty,t]}(x) + x^{-1}(1 - \chi(x-t))'1_{(-\infty,t]}(x) \in L^2(\mathbb{R}), \]

so $g_t^c \in H^1(\mathbb{R})$. The functions $g_t^s$, $xg_t^s$ are both integrable since $\chi$ is. The (weak) derivatives of $xg_t^s$
and $g_t^s$ are $\chi'(x-t)1_{(-\infty,t)} - \delta_t$ and $-x^{-2}\chi(x-t)1_{(-\infty,t]} + x^{-1}\chi'(x-t)1_{(-\infty,t)} - t^{-1}\delta_t$, respectively,
with point measures $\delta_t$. So, both functions are of bounded variation and their Fourier transforms
are bounded by $(1+|u|)^{-1}$ up to multiplicative constants. Finally, observe that $g_t$ is constant
and thus Lipschitz near zero, so that $g_t$ is admissible.

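For the second claim we again only consider $t < 0$ and first observe, $\chi$ being bounded, that

$$\|g_t^c\|_{L^2}^2 \lesssim \int_{-\infty}^{t} x^{-2}\,dx \sim |t|^{-1}$$

as $t \to -\infty$. Likewise, using the explicit form of $(g_t^c)'$, we see

$$\|g_t^c\|_{H^1} \lesssim \|g_t^c\|_{L^2} + \|(g_t^c)'\|_{L^2} \lesssim (1+|t|)^{-1/2}.$$

For $g_t^s = x^{-1}\mathbf 1_{(-\infty,t]}\chi(x-t)$ we see $\|g_t^s\|_{L^1} \le |t|^{-1}\|\chi\|_{L^1}$, and the total variation of the derivative of $g_t^s$ is bounded by $t^{-2}\|\chi\|_{L^1} + |t|^{-1}\|\chi'\|_{L^1} + |t|^{-1}$. We conclude that $|\mathcal F g_t^s(u)| \lesssim (1+|u|)^{-1}(1+|t|)^{-1}$ holds. The same argument gives a bound independent of $t$ for $|\mathcal F[xg_t^s](u)|$, thus completing the proof. □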

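7 Theorem. Suppose Assumption 1 is satisfied, $g$ is admissible and $h_n \sim n^{-1/2}(\log n)^{-\rho}$ for some $\rho > 1$. Then setting

$$\hat N_n(g) := \frac{1}{i\Delta}\int_{\mathbb R} g(x)\,\mathcal F^{-1}\Big[\frac{\varphi_n'}{\varphi_n}\,\mathcal F K_{h_n}\Big](x)\,dx, \qquad N(g) := \int_{\mathbb R} g(x)\,x\,\nu(dx)$$

(with some abuse of notation $N(t) = N(g_t)$ etc.), we have asymptotic normality,

$$\sqrt n\,(\hat N_n(g) - N(g)) \xrightarrow{\mathcal L} N(0,\sigma_g^2)$$

as $n \to \infty$ with finite variance

$$\sigma_g^2 = (i\Delta)^{-2}\int \Big(\mathcal F^{-1}[\mathcal F g(u)\varphi^{-1}(-u)](x)\,ix + \mathcal F^{-1}[\mathcal F g(u)(\varphi^{-1})'(-u)](x)\Big)^2 P(dx).$$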

8 Corollary. Under the assumptions of the preceding theorem the finite-dimensional distributions of the processes $(\sqrt n(\hat N_n(t) - N(t)), t \in \mathbb R\setminus\{0\})$ converge to those of $\mathbb G^\varphi$ as $n \to \infty$, where $\mathbb G^\varphi$ is a centered Gaussian process, indexed by $\mathbb R\setminus\{0\}$, with covariance structure given by (2.6) for $t,s \in \mathbb R\setminus\{0\}$.

Proof. This follows directly by the Cramér-Wold device applied to any finite subfamily of $\{g_t, t \in \mathbb R\setminus\{0\}\}$, using the preceding lemma and theorem. □

The remaining part of this subsection is devoted to the proof of Theorem 7.

4.1.1 Discarding the drift $\gamma$

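We shall show that we may assume $\gamma = 0$ in the sequel. To see this, observe that shifting $X_k \mapsto \tilde X_k = X_k + \gamma$ leads to the shift in the empirical quotient

$$\varphi_n'(u)/\varphi_n(u) \mapsto \tilde\varphi_n'(u)/\tilde\varphi_n(u) = (e^{i\gamma u}\varphi_n)'(u)/(e^{i\gamma u}\varphi_n(u)) = i\gamma + \varphi_n'(u)/\varphi_n(u)$$

and the true quotient also satisfies $\tilde\varphi'(u)/\tilde\varphi(u) = i\gamma + \varphi'(u)/\varphi(u)$. In $\hat N_n(g) - N(g)$ this shift thus induces the error

$$\Big|\frac{1}{i\Delta}\int g(x)\mathcal F^{-1}[i\gamma(\mathcal F K_h - 1)](x)\,dx\Big| = \Big|\frac{\gamma}{\Delta}\int (g(x)-g(0))K_h(x)\,dx\Big|$$

$$\lesssim \int_{-\delta}^{\delta}\|g\|_{\mathrm{Lip}(\delta)}\,|x||K_h(x)|\,dx + \int_{[-\delta,\delta]^c}\|g\|_\infty |K_h(x)|\,dx$$

$$\lesssim \int_{-\delta}^{\delta}|x|\,h^{-1}(1\wedge|x/h|^{-\beta})\,dx + \int_{\delta/h}^{\infty}(1+|u|)^{-\beta}\,du \lesssim h,$$

where we have used the Lipschitz constant of $g$ in a $\delta$-neighbourhood of zero and (2.4) with $\beta > 2$. By the choice of $h = h_n$ this error is of order $O(h_n) = o(n^{-1/2})$ and thus negligible in the asymptotic distribution of $\sqrt n(\hat N_n(g) - N(g))$, and we note that this bound is uniform in all $g$ satisfying the admissibility conditions with uniform constants. Henceforth, without loss of generality, we shall only consider the case $\gamma = 0$.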
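The shift identity for the empirical quotient is purely algebraic, so it can be checked numerically on any sample. The following is a minimal sketch in plain Python; the helper names `ecf`, `ecf_derivative` and `quotient` are hypothetical and introduced only for illustration, and the Gaussian sample and the value of the drift are arbitrary stand-ins, not data from the paper.

```python
import cmath
import random


def ecf(xs, u):
    """Empirical characteristic function: mean of exp(i*u*X_k)."""
    return sum(cmath.exp(1j * u * x) for x in xs) / len(xs)


def ecf_derivative(xs, u):
    """Derivative of the empirical characteristic function: mean of i*X_k*exp(i*u*X_k)."""
    return sum(1j * x * cmath.exp(1j * u * x) for x in xs) / len(xs)


def quotient(xs, u):
    """Empirical quotient phi_n'(u) / phi_n(u)."""
    return ecf_derivative(xs, u) / ecf(xs, u)


rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]  # illustrative data only
gamma = 0.7                                          # illustrative drift value
shifted = [x + gamma for x in sample]

u = 0.3
# Shifting every observation by gamma adds i*gamma to the quotient (up to rounding).
shift_error = abs(quotient(shifted, u) - quotient(sample, u) - 1j * gamma)
```

The quantity `shift_error` is zero up to floating-point rounding, reflecting that the drift enters the quotient only through the additive constant $i\gamma$.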

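4.1.2 Approximation error

By approximation error we understand here the deterministic 'bias' term

$$\frac{1}{2\pi i\Delta}\int \mathcal F g(-u)\frac{\varphi'(u)}{\varphi(u)}\mathcal F K_h(u)\,du - \frac{1}{2\pi i\Delta}\int \mathcal F g(-u)\frac{\varphi'(u)}{\varphi(u)}\,du$$

induced by the spectral cutoff with $\mathcal F K_h$. We use Assumption 1(b), i.e. that $|\psi'(u)| = |\mathcal F[x\nu](u)| \lesssim (1+|u|)^{-1}$. Moreover, we split $g = g^c + g^s$ and treat the bias of each term separately.

For the term involving $g^s$, using the Lipschitz continuity and boundedness of $\mathcal F K$ (due to (2.4) with $\beta > 2$),

$$\Big|\frac{1}{2\pi\Delta}\int \mathcal F g^s(-u)\frac{\varphi'(u)}{\varphi(u)}(1-\mathcal F K_h)(u)\,du\Big| \lesssim \int (1+|u|)^{-1}|\psi'(u)||1-\mathcal F K(hu)|\,du$$

$$\lesssim \int (1+|u|)^{-2}\min(h|u|,1)\,du \lesssim h\log(h^{-1}).$$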
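The last integral in this bound can be evaluated in closed form, which makes the order $h\log(h^{-1})$ explicit. The computation below is a worked check added for illustration; it uses only elementary calculus and the symbols already introduced.

```latex
\begin{align*}
\int_{\mathbb R} (1+|u|)^{-2}\min(h|u|,1)\,du
  &= 2h\int_0^{1/h}\frac{u}{(1+u)^2}\,du + 2\int_{1/h}^{\infty}\frac{du}{(1+u)^2} \\
  &= 2h\Big[\log(1+u) + \frac{1}{1+u}\Big]_0^{1/h} + \frac{2h}{1+h} \\
  &= 2h\log(1+h^{-1}) + \frac{2h^2}{1+h} - 2h + \frac{2h}{1+h} \\
  &= 2h\log(1+h^{-1}).
\end{align*}
```

Since $\log(1+h^{-1}) \le \log(2h^{-1})$ for $0 < h \le 1$, the right-hand side is of order $h\log(h^{-1})$ as $h \to 0$.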
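For $g^c$ we have by the Cauchy-Schwarz inequality

$$\Big|\frac{1}{2\pi\Delta}\int \mathcal F g^c(-u)\frac{\varphi'(u)}{\varphi(u)}(1-\mathcal F K_h)(u)\,du\Big| \lesssim \int (1+|u|)|\mathcal F g^c(-u)|(1+|u|)^{-2}h|u|\,du \lesssim h\,\|g^c\|_{H^1}\Big(\int (1+|u|)^{-2}\,du\Big)^{1/2}.$$

Combining these two estimates, and since $h = h_n = o(n^{-1/2}(\log n)^{-1})$, we conclude that the bias term is of negligible order $o(n^{-1/2})$ in the asymptotic distribution of $\sqrt n(\hat N_n(g) - N(g))$.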

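4.1.3 Main stochastic term

Linearising the error in the quotient $\varphi_n'/\varphi_n$, we identify two major stochastic terms:

$$\frac{\varphi_n'(u)}{\varphi_n(u)} - \frac{\varphi'(u)}{\varphi(u)} = \varphi^{-1}(u)(\varphi_n' - \varphi')(u) + (\varphi^{-1})'(u)(\varphi_n - \varphi)(u) + R_n(u) \tag{4.1}$$

with remainder

$$R_n(u) = \Big(1 - \frac{\varphi_n(u)}{\varphi(u)}\Big)\Big(\frac{\varphi_n'(u)}{\varphi_n(u)} - \frac{\varphi'(u)}{\varphi(u)}\Big),$$

where we used the identity $\varphi^{-1}\varphi' + (\varphi^{-1})'\varphi = (\varphi^{-1}\varphi)' = 0$. Discarding the remainder term for the time being, we study the linear centered term

$$\begin{aligned}
&\frac{1}{2\pi i\Delta}\int \mathcal F g(-u)\mathcal F K_h(u)\big(\varphi^{-1}(u)(\varphi_n' - \varphi')(u) + (\varphi^{-1})'(u)(\varphi_n - \varphi)(u)\big)\,du \\
&\quad= \frac{1}{2\pi i\Delta}\int \mathcal F g(-u)\mathcal F K_h(u)\big(\varphi^{-1}(u)\varphi_n'(u) + (\varphi^{-1})'(u)\varphi_n(u)\big)\,du \\
&\quad= \frac{1}{2\pi i\Delta}\int \mathcal F g(-u)\mathcal F K_h(u)\big(\varphi^{-1}(u)\mathcal F[ixP_n](u) + (\varphi^{-1})'(u)\mathcal F[P_n](u)\big)\,du \\
&\quad= \frac{1}{i\Delta}\int \Big(\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g(u)\mathcal F K_h(-u)](x)\,ix + \mathcal F^{-1}[(\varphi^{-1})'(-u)\mathcal F g(u)\mathcal F K_h(-u)](x)\Big)P_n(dx).
\end{aligned} \tag{4.2}$$

These manipulations are justified by standard Fourier analysis of finite measures, using the compact support of $\mathcal F K_h$ and of $P_n$ as well as that $(1+|u|)^{-1}\varphi^{-1}(u)$, $\mathcal F g$, $(\varphi^{-1})'$ are all in $L^2(\mathbb R)$ (by virtue of Assumption 1(c), admissibility of $g$, Lemma 4(b)).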
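The decomposition (4.1) is an exact algebraic identity at every fixed frequency, which can be confirmed numerically. The sketch below is illustrative only: the function name `linearise` is hypothetical, and the four complex numbers stand in for the values of $\varphi$, $\varphi'$, $\varphi_n$, $\varphi_n'$ at one frequency.

```python
def linearise(phi, dphi, phi_n, dphi_n):
    """Return (lhs, linear, remainder) of the linearisation (4.1) at one frequency."""
    inv = 1.0 / phi            # phi^{-1}(u)
    dinv = -dphi / phi ** 2    # (phi^{-1})'(u)
    lhs = dphi_n / phi_n - dphi / phi
    linear = inv * (dphi_n - dphi) + dinv * (phi_n - phi)
    remainder = (1.0 - phi_n / phi) * (dphi_n / phi_n - dphi / phi)
    return lhs, linear, remainder


# Illustrative values: the "empirical" quantities are small perturbations of the true ones.
lhs, linear, remainder = linearise(0.8 - 0.3j, -0.2 + 0.5j, 0.75 - 0.28j, -0.22 + 0.47j)
identity_error = abs(lhs - (linear + remainder))
```

The remainder is a product of two small factors, which is why it is of smaller order than the linear term when the empirical quantities are close to the true ones.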

Thus, the central limit theorem for triangular arrays under Lyapounov's condition (e.g. Theorem 28.3 combined with (28.8) in Bauer (1996)) applies to the standardised sums if

$$\sup_{h\in(0,1)}\int \Big|\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g(u)\mathcal F K_h(-u)](x)\,ix + \mathcal F^{-1}[(\varphi^{-1})'(-u)\mathcal F g(u)\mathcal F K_h(-u)](x)\Big|^{2+\varepsilon}P(dx) \tag{4.3}$$

is finite.

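We use the decomposition $g = g^c + g^s$ and deal with $g^c$ first. We have from the Cauchy-Schwarz inequality, Assumption 1(c) and admissibility of $g$

$$\int |\mathcal F[g^c](u)||\varphi^{-1}(-u)|\,du \le \|g^c\|_{H^1}\|\mathcal F^{-1}[\varphi^{-1}]\|_{H^{-1}} < \infty. \tag{4.4}$$

Since also $\sup_{h>0,u}|\mathcal F K_h(u)| \le \|K\|_{L^1} < \infty$ we have $\mathcal F[g^c]\varphi^{-1}(-\cdot)\mathcal F K_h \in L^1(\mathbb R)$ and thus

$$\sup_{h\in(0,1)}\big|\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g^c(u)\mathcal F K_h(-u)]\big| \in L^\infty(\mathbb R).$$

The integral over the first term in (4.3) with $g^c$ replacing $g$ is thus finite in view of $\int |x|^{2+\varepsilon}P(dx) < \infty$ by Assumption 1(a).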

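For the singular part we remark $|(\mathcal F K_h)'(u)| \le \|xK_h\|_{L^1} \lesssim h$ as well as (by Assumption 1(b)) $|(\varphi^{-1})'(u)| = \Delta|\psi'(u)\varphi^{-1}(u)| \lesssim (1+|u|)^{-1}|\varphi^{-1}(u)|$. We conclude uniformly in $h$, using admissibility of $g$,

$$\big|(\varphi^{-1}(-\cdot)\mathcal F g^s\,\mathcal F K_h(-\cdot))'(u)\big| \lesssim |\varphi^{-1}(-u)|(1+|u|)^{-1}.$$

By Assumption 1(c) and the Sobolev embedding this implies

$$\sup_h \big|\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g^s(u)\mathcal F K_h(-u)](x)(1+ix)\big| \in H^{\varepsilon}(\mathbb R) \subset L^{2+\varepsilon}(\mathbb R). \tag{4.5}$$

Using Lemma 4(a) and $|x|^{2+\varepsilon} \le |x||1+ix|^{2+\varepsilon}$, also the integral over the first term in (4.3) with $g^s$ replacing $g$ is finite.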

For the integral over the second term in (4.3) we recall $\sup_{h>0,u}|\mathcal F K_h(u)| \le \|K\|_{L^1} < \infty$ and that $\mathcal F g$, $(\varphi^{-1})'$ are both in $L^2(\mathbb R)$ to deduce $|\mathcal F g(u)\mathcal F K_h(-u)(\varphi^{-1})'(-u)| \in L^1(\mathbb R)$ by the Cauchy-Schwarz inequality. By Fourier inversion $\mathcal F^{-1}[\mathcal F g(u)\mathcal F K_h(-u)(\varphi^{-1})'(-u)] \in L^\infty$ holds, and since $P$ is a probability measure, also the integral over the second term is finite.

Altogether we have shown that under our conditions the main stochastic error term is asymptotically normal with rate $1/\sqrt n$ and mean zero. For $n \to \infty$ the variances converge to $\sigma_g^2$, which follows from $\mathcal F K_{h_n} \to 1$ pointwise and uniform integrability by bounded $(2+\varepsilon)$-moments.

4.1.4 Remainder term

In what follows $\Pr$ stands for the usual product probability measure describing the joint law of $X_1, X_2, \ldots$, and $Z_n = O_P(r_n)$ means that $r_n^{-1}Z_n$ is bounded in $\Pr$-probability. We show that the remainder term is $O_P(r_n)$ for some $r_n = o(n^{-1/2})$, and therefore negligible in the asymptotic distribution of $\sqrt n(\hat N_n(g) - N(g))$.

From Theorem 4.1 of Neumann and Reiß (2009) we have for any $\delta > 0$, using the finite $(2+\varepsilon)$-moment property of $P$ from (2.5),

$$\sup_{|u|\le U}\big(|\varphi_n(u) - \varphi(u)| + |\varphi_n'(u) - \varphi'(u)|\big) = O_P\big(n^{-1/2}(\log U)^{1/2+\delta}\big).$$

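This implies in particular, using

$$\inf_{|u|\le h_n^{-1}}|\varphi(u)| \gtrsim \inf_{|u|\le h_n^{-1}}(1+|u|)^{-1/2} \gtrsim \sqrt{h_n} \gtrsim n^{-1/4}(\log n)^{-\rho/2} \tag{4.6}$$

from Lemma 4(b), that for any constant $0 < \kappa < 1$,

$$\begin{aligned}
&\Pr\Big(|\varphi_n^{-1}(u)| > \kappa^{-1}|\varphi^{-1}(u)| \text{ for some } u \in [-h_n^{-1},h_n^{-1}]\Big) \\
&\quad= \Pr\Big(\Big|\frac{\varphi(u)}{\varphi_n(u)}\Big| > \kappa^{-1} \text{ for some } u \in [-h_n^{-1},h_n^{-1}]\Big) \\
&\quad\le \Pr\Big(\Big|\frac{\varphi_n(u)-\varphi(u)}{\varphi_n(u)}\Big| > \kappa^{-1}-1 \text{ for some } u \in [-h_n^{-1},h_n^{-1}]\Big) \\
&\quad\le \Pr\Big(\sup_{|u|\le h_n^{-1}}|\varphi_n(u)-\varphi(u)| \gtrsim n^{-1/4}(\log n)^{-\rho/2}\Big) \to 0
\end{aligned}$$

as $n \to \infty$, in other words, on events of probability approaching one, $\varphi_n^{-1}$ grows no faster than $\varphi^{-1}$ uniformly on increasing sets $[-h_n^{-1},h_n^{-1}]$.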

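Now to control the remainder term (4.1) we use $\mathrm{supp}(\mathcal F K_h) \subset [-h^{-1},h^{-1}]$ and distinguish each term of the decomposition $g = g^c + g^s$. First, using $|\mathcal F g^s(u)| \lesssim (1+|u|)^{-1}$, Lemma 4(b) and Assumption 1(c) we see

$$\begin{aligned}
\Big|\int \mathcal F g^s(-u)\mathcal F K_h(u)R_n(u)\,du\Big|
&= O_P\Big(\int_{-h^{-1}}^{h^{-1}}(1+|u|)^{-1}n^{-1}(\log h^{-1})^{1+2\delta}|\varphi^{-1}(u)|\big(|\varphi(u)|^{-1} + |(\varphi^{-1})'(u)|\big)\,du\Big) \\
&= O_P\Big(n^{-1}(\log h^{-1})^{1+2\delta}h^{2\varepsilon-1}\int (1+|u|)^{-2+2\varepsilon}|\varphi(u)|^{-2}\,du\Big) \\
&= O_P\big(n^{-1}(\log h^{-1})^{1+2\delta}h^{2\varepsilon-1}\big).
\end{aligned}$$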
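For the nonsingular part we have likewise, using the Cauchy-Schwarz inequality, $g^c \in H^1(\mathbb R)$, (4.6) and Assumption 1(c),

$$\begin{aligned}
\Big|\int \mathcal F g^c(-u)\mathcal F K_h(u)R_n(u)\,du\Big|
&= O_P\Big(n^{-1}(\log h^{-1})^{1+2\delta}\Big(\int_{-h^{-1}}^{h^{-1}}(1+|u|)^{-2}|\varphi(u)|^{-4}\,du\Big)^{1/2}\Big) \\
&= O_P\big(n^{-1}(\log h^{-1})^{1+2\delta}h^{-1/2}\|\varphi^{-1}(u)(1+|u|)^{-1}\|_{L^2}\big).
\end{aligned}$$

Consequently, the remainder term is negligible because $h_n^{-1/2}(\log h_n^{-1})^{1+2\delta} = o(n^{1/2})$. Note that this gives in fact uniform $o_P(n^{-1/2})$-control of the remainder term for all $g$ that satisfy the admissibility bounds uniformly.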

4.2 Tightness of the Linear Term

We study the linear part (4.2) and introduce the empirical process

$$\nu_n^\varphi(t) := \sqrt n\,\frac{1}{i\Delta}\int \Big(\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g_t(u)\mathcal F K_{h_n}(-u)](x)\,ix + \mathcal F^{-1}[(\varphi^{-1})'(-u)\mathcal F g_t(u)\mathcal F K_{h_n}(-u)](x)\Big)(P_n - P)(dx), \quad |t| \ge \zeta > 0. \tag{4.7}$$

Recall that this process is centered even without subtracting $P$. Moreover, since $\sup_{|t|\ge\zeta}\|g_t\|_{L^2} < \infty$, the arguments after (4.2) imply that $\nu_n^\varphi$ is a (possibly non-measurable) random element of the space $\ell^\infty((-\zeta,\zeta)^c)$ of bounded functions on $(-\infty,-\zeta] \cup [\zeta,\infty)$ (the complement of $(-\zeta,\zeta)$ in $\mathbb R$) equipped with the uniform norm.
4.2.1 Pregaussian limit process

Theorem 2 will follow if we show that $\nu_n^\varphi$ converges to $\mathbb G^\varphi$ in law in $\ell^\infty((-\zeta,\zeta)^c)$. For this statement to make sense we have to show first that $\mathbb G^\varphi$ defines a proper Borel random variable in $\ell^\infty((-\zeta,\zeta)^c)$, which is implied by the following more general result. Recall that any Gaussian process $\{G(t)\}_{t\in T}$ induces its intrinsic covariance metric $d^2(s,t) = E(G(s) - G(t))^2$ on the index set $T$.

9 Theorem. Grant Assumption 1. The Gaussian process $\{\mathbb G^\varphi(t)\}_{t:|t|\ge\zeta}$ with covariance given by (2.6) admits a version, still denoted by $\mathbb G^\varphi$, which has uniformly continuous sample paths almost surely for the intrinsic covariance metric of $\mathbb G^\varphi$, and which satisfies $\sup_{t:|t|\ge\zeta}|\mathbb G^\varphi(t)| < \infty$ almost surely.

The proof moreover implies that $(-\zeta,\zeta)^c$ is totally bounded in the metric $d$. Therefore (a version of) $\mathbb G^\varphi$ concentrates on the separable subspace of $\ell^\infty((-\zeta,\zeta)^c)$ consisting of bounded $d$-uniformly continuous functions on $(-\zeta,\zeta)^c$, from which we may in particular conclude that $\mathbb G^\varphi$ defines a Borel random variable in that space, and hence is also a Borel random variable in the ambient space $\ell^\infty((-\zeta,\zeta)^c)$.

Next to Dudley's entropy integral, the main tool in the proof of Theorem 9 is the following bound for the pseudo-differential operator $\mathcal F^{-1}[\varphi^{-1}(-\cdot)]$. For $f \in L^2(\mathbb R)$ we set $\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * f := \mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F f(u)]$, which is well defined at least in $H^{-(1-\varepsilon)/2}(\mathbb R)$ in view of Lemma 4. Alternatively, $\|\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * f\|_{L^2} \lesssim \|(1+|u|)^{(1-\varepsilon)/2}\mathcal F f(u)\|_{L^2}$ whenever $f \in H^{(1-\varepsilon)/2}(\mathbb R)$, but such an inequality is not sufficient for our purposes. We need a stronger estimate for functions $f$ supported away from the origin, and with the $\|\cdot\|_{L^2}$-norm replaced by the $\|\cdot\|_{2,P}$-norm. Intuitively speaking, and considering the example $f = \mathbf 1_{(s,t]}$, $s < t < 0$, relevant below, this strengthening is possible since the locations of singularities of $\mathbf 1_{(s,t]}$ and of $P$ (at the origin) are separated away from each other, and since this remains so after application of the pseudo-local operator $\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * (\cdot)$ to $f$.

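10 Proposition. Grant Assumption 1 and define $\|h\|_{2,P} := (\int h^2\,dP)^{1/2}$. For $f \in L^2(\mathbb R)$ with $\mathrm{supp}(f) \cap (-\delta,\delta) = \emptyset$ for some $\delta > 0$ we have

$$\|\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * f\|_{2,P} \lesssim \|(1+|u|)^{1-\varepsilon}\mathcal F f(u)\|_{L^{2+4/\varepsilon}(\mathbb R)} + \Big(\int \frac{f(y)^2}{y^2}\,dy\Big)^{1/2} \tag{4.8}$$

provided the right-hand side is finite. The constant in this bound depends only on $\delta$.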
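Proof. We shall need the pseudo-differential operator identity

$$(\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * f)(x) = \Big(\Big(\frac{1}{i\,\cdot}\mathcal F^{-1}[(\varphi^{-1}(-\cdot))']\Big) * f\Big)(x), \quad f \in L^2(\mathbb R),\ x \notin \mathrm{supp}(f), \tag{4.9}$$

where the right-hand side is defined classically. This identity is fundamental for establishing the property of pseudo-locality in a $C^\infty$-framework, see e.g. Theorems 8.8 and 8.9 in Folland (1995). Let us verify this identity here, where $\varphi^{-1} \notin C^\infty$. Consider $f \in L^2(\mathbb R)$ and $g$ any smooth compactly supported test function such that $\mathrm{supp}(f) \cap \mathrm{supp}(g) = \emptyset$. Then $(f * g(-\cdot))(0) = 0$ and $f * g(-\cdot)$ is smooth, from which we may conclude that also $x^{-1}(f * g(-\cdot))(x)$ (equal to $(f * g(-\cdot))'(0)$ at zero) is in $L^2(\mathbb R)$ and smooth, and that

$$\mathcal F\Big[\frac{(f * g(-\cdot))(x)}{ix}\Big]'(u) = \mathcal F[f * g(-\cdot)](u) = \mathcal F f(u)\,\mathcal F g(-u).$$

Plancherel's formula, integration by parts and Fubini's theorem (using $(\varphi^{-1})' \in L^2(\mathbb R)$ from Lemma 4 and the support properties) yield

$$\begin{aligned}
\int (\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * f)(x)g(x)\,dx
&= \frac{1}{2\pi}\int \varphi^{-1}(-u)\,\mathcal F\Big[\frac{(f * g(-\cdot))(x)}{ix}\Big]'(u)\,du \\
&= \frac{-1}{2\pi}\int (\varphi^{-1}(-\cdot))'(u)\,\mathcal F\Big[\frac{(f * g(-\cdot))(x)}{ix}\Big](u)\,du \\
&= \int \frac{\mathcal F^{-1}[(\varphi^{-1}(-\cdot))'](x)}{ix}\,(f * g(-\cdot))(-x)\,dx \\
&= \int \Big(\Big(\frac{1}{i\,\cdot}\mathcal F^{-1}[(\varphi^{-1}(-\cdot))']\Big) * f\Big)(x)g(x)\,dx.
\end{aligned}$$

In this calculation the boundary terms vanish due to the fast decay of $\mathcal F[(f * g(-\cdot))(x)/x]$ ($g$ smooth). Consequently, (4.9) follows by testing with all $g$ supported near $x$.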

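We use Hölder's inequality, the Hausdorff-Young inequality from Fourier analysis, the bound $p(x) \lesssim |x|^{-1}$ from Lemma 4, the pseudo-differential operator identity, again Hölder's inequality, Assumption 1(c) and $(\varphi^{-1})' \in L^2(\mathbb R)$ in view of Lemma 4 in this order to obtain for $\delta' = \delta/2$:

$$\begin{aligned}
\int (\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * f)^2(x)\,P(dx)
&\le \|\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * f\|^2_{L^{2+\varepsilon}(\mathbb R)}\,\|p\|_{L^{(2+\varepsilon)/\varepsilon}([-\delta',\delta']^c)} \\
&\quad + \|\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * f\|^2_{L^\infty([-\delta',\delta'])}\,P([-\delta',\delta']) \\
&\lesssim \|\varphi^{-1}(-\cdot)\mathcal F f\|^2_{L^{(2+\varepsilon)/(1+\varepsilon)}}\,\|xp\|_\infty(\delta')^{-2/(2+\varepsilon)} \\
&\quad + \big\|\big(\mathcal F^{-1}[(\varphi^{-1})'(-\cdot)](x)/x\big) * f\big\|^2_{L^\infty([-\delta',\delta'])} \\
&\lesssim \|(1+|u|)^{1-\varepsilon}\mathcal F f(u)\|^2_{L^{2+4/\varepsilon}} + \sup_{x\in[-\delta',\delta']}\int \frac{f(y)^2}{(x-y)^2}\,dy,
\end{aligned}$$

provided $f$ is such that the last line is finite. Since $|x-y| \ge |y|/2$ for $|x| \le \delta'$ and $y \in \mathrm{supp}(f)$, we may take square roots to deduce the asserted inequality with a constant independent of $f$. □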
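The exponents appearing in this chain of inequalities follow from two Hölder pairings. The bookkeeping below is a worked check added for illustration, using only the symbols of the proof.

```latex
\begin{align*}
&\text{Hölder with } \tfrac{2}{2+\varepsilon} + \tfrac{\varepsilon}{2+\varepsilon} = 1:
 && \int G^2\,p \le \|G\|^2_{L^{2+\varepsilon}}\,\|p\|_{L^{(2+\varepsilon)/\varepsilon}}, \\
&\text{tail of the density, } p(x) \le \|xp\|_\infty |x|^{-1}:
 && \Big(2\int_{\delta'}^{\infty} x^{-(2+\varepsilon)/\varepsilon}\,dx\Big)^{\varepsilon/(2+\varepsilon)}
    = \big(\varepsilon\,(\delta')^{-2/\varepsilon}\big)^{\varepsilon/(2+\varepsilon)}
    \lesssim (\delta')^{-2/(2+\varepsilon)}, \\
&\text{Hausdorff--Young, conjugate of } 2+\varepsilon:
 && r = \tfrac{2+\varepsilon}{1+\varepsilon}, \\
&\text{Hölder with } \tfrac1r = \tfrac12 + \tfrac1s:
 && \tfrac1s = \tfrac{1+\varepsilon}{2+\varepsilon} - \tfrac12 = \tfrac{\varepsilon}{2(2+\varepsilon)},
    \qquad s = 2 + \tfrac{4}{\varepsilon}.
\end{align*}
```

The last pairing splits $\varphi^{-1}(-u)\mathcal F f(u)$ into the factor $(1+|u|)^{-1+\varepsilon}\varphi^{-1}(-u) \in L^2(\mathbb R)$, controlled by Assumption 1(c), and the factor $(1+|u|)^{1-\varepsilon}\mathcal F f(u)$, measured in $L^{2+4/\varepsilon}(\mathbb R)$.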

Proof of Theorem 9. We consider the generalised Brownian bridge process arising as the pointwise weak limit of (4.7), so with $\mathcal F K_h = 1$, and further split $g_t = g_t^c + g_t^s$ as in the proof of Lemma 6. More precisely, we study the Gaussian process indexed by $(i\Delta)^{-1}$ times

$$\begin{aligned}
h_t(x) &= \mathcal F^{-1}[(\varphi^{-1})'(-u)\mathcal F g_t(u)](x) + \big(\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g_t(u)](x)\big)\,ix \\
&= \mathcal F^{-1}[(\varphi^{-1})'(-u)\mathcal F g_t(u)](x) + \big(\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g_t^c(u)](x) + \mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g_t^s(u)](x)\big)\,ix,
\end{aligned} \tag{4.10}$$

where $|t| \ge \zeta$. The theorem is thus proved if we show that the class of functions $\mathcal G = \{(i\Delta)^{-1}h_t : t \in \mathbb R\setminus(-\zeta,\zeta)\}$ is bounded in $L^2(P)$ and $P$-pregaussian (cf. Dudley (1999), Chapter 2, p. 92-93). In Section 4.1.3 above we have shown the $L^{2+\varepsilon}(P)$-boundedness of the same function class, but also involving the kernel $K_h$. The same proof, replacing $\mathcal F K_h$ just by one, shows that $\mathcal G$ is even $L^{2+\varepsilon}(P)$-bounded. To establish that $\mathcal G$ is pregaussian it suffices, by Dudley's integral criterion, to find a suitable $\eta$-covering of $\mathcal G$ in the intrinsic covariance metric $d(s,t) := \|h_t - h_s\|_{2,P}$, for every $h_t, h_s \in \mathcal G$.
Consider first increments for $s < t$, $|s-t| \le 1$, $\min(|s|,|t|) \ge \zeta$,

$$\begin{aligned}
h_t(x) - h_s(x) &= \mathcal F^{-1}[(\varphi^{-1})'(-u)\mathcal F[g_t - g_s](u)](x) + \mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F[g_t - g_s](u)](x)\,ix \\
&= i\,\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F[x(g_t - g_s)](u)](x)
\end{aligned}$$

for which Proposition 10 yields, with $f = \mathbf 1_{(s,t]}$, the Hölder-type bound

$$\|h_t - h_s\|^2_{2,P} \lesssim \|\sin((t-s)u)u^{-1}(1+|u|)^{1-\varepsilon}\|^2_{L^{2+4/\varepsilon}} + |t-s| \lesssim |t-s|^{\varepsilon(3+2\varepsilon)/(2+\varepsilon)}.$$

This will give us a polynomially growing covering of $\mathcal G$ for all $t$ in a fixed compact interval.
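The Hölder exponent in this bound can be traced by splitting the frequency integral at $|u| = \tau^{-1}$ with $\tau := t - s \in (0,1]$. The computation below is a worked sketch added for illustration, with $q := 2 + 4/\varepsilon$ and all constants suppressed.

```latex
\begin{align*}
\int_{|u|\le \tau^{-1}} \frac{|\sin(\tau u)|^{q}}{|u|^{q}}(1+|u|)^{(1-\varepsilon)q}\,du
  &\le \tau^{q}\int_{|u|\le\tau^{-1}}(1+|u|)^{(1-\varepsilon)q}\,du
   \lesssim \tau^{q}\,\tau^{-(1-\varepsilon)q-1} = \tau^{\varepsilon q - 1}, \\
\int_{|u|>\tau^{-1}} \frac{|\sin(\tau u)|^{q}}{|u|^{q}}(1+|u|)^{(1-\varepsilon)q}\,du
  &\lesssim \int_{|u|>\tau^{-1}} |u|^{-\varepsilon q}\,du
   \lesssim \tau^{\varepsilon q - 1}, \qquad \varepsilon q = 2\varepsilon + 4 > 1, \\
\big\|\sin(\tau u)u^{-1}(1+|u|)^{1-\varepsilon}\big\|^{2}_{L^{q}}
  &\lesssim \tau^{(\varepsilon q - 1)\,2/q}
   = \tau^{(3+2\varepsilon)\,\varepsilon/(2+\varepsilon)}.
\end{align*}
```

Here $2/q = \varepsilon/(2+\varepsilon)$, and the additional term $|t-s|$ is dominated by this power whenever the exponent $\varepsilon(3+2\varepsilon)/(2+\varepsilon)$ does not exceed one.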

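To deal with large $|t|$ we shall establish the polynomial decay bound $\|h_t\|_{2,P} \lesssim |t|^{-1/2}$ as $|t| \to \infty$, and we shall do this for each of the three terms in the second line of (4.10) separately.

For the first term, say $h_t^{(1)}$, this follows from

$$\|h_t^{(1)}\|^2_{2,P} \le \|h_t^{(1)}\|^2_\infty \le \|\mathcal F^{-1}[(\varphi^{-1})'(-\cdot)] * g_t\|^2_\infty \lesssim \|(\varphi^{-1})'(-\cdot)\|^2_{L^2}\|g_t\|^2_{L^2} \lesssim \int_{-\infty}^{t}|x|^{-2}\,dx = |t|^{-1}$$

as $t \to -\infty$, and likewise for $t \to \infty$, using the Cauchy-Schwarz inequality and Lemma 4(b).

For the second term $h_t^{(2)}$ we use the Cauchy-Schwarz inequality, the finite second moment of $P$, Assumption 1(c) and Lemma 6 to the effect that

$$\|h_t^{(2)}\|^2_{2,P} \le \int x^2 P(dx)\,\|\varphi^{-1}(-u)\mathcal F g_t^c(u)\|^2_{L^1} \lesssim \|\mathcal F^{-1}[\varphi^{-1}]\|^2_{H^{-1}}\|g_t^c\|^2_{H^1} \lesssim (1+|t|)^{-1}.$$

For the third term $h_t^{(3)}$, since $xP$ has a bounded density by Lemma 4(a), it suffices to bound

$$\big\|\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g_t^s](x)\,|x|^{1/2}\big\|_{L^2},$$

which by the Cauchy-Schwarz inequality can be estimated by

$$\|\varphi^{-1}(-u)\mathcal F g_t^s(u)\|^{1/2}_{L^2}\,\big\|(\varphi^{-1}(-\cdot)\mathcal F g_t^s)'(u)\big\|^{1/2}_{L^2}.$$

Now by Lemma 6 we know $|\mathcal F g_t^s(u)| \lesssim (1+|u|)^{-1}(1+|t|)^{-1}$, $|\mathcal F[xg_t^s](u)| \lesssim (1+|u|)^{-1}$, and since $|(\varphi^{-1})'(u)| \lesssim (1+|u|)^{-1}|\varphi^{-1}(u)|$ from the proof of Lemma 4 we can estimate the product in the last display to obtain the overall bound

$$\|h_t^{(3)}\|_{2,P} \lesssim (1+|t|)^{-1/2}\|\varphi^{-1}(-u)(1+|u|)^{-1}\|_{L^2} \lesssim (1+|t|)^{-1/2}$$

in view of Assumption 1(c).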

In conclusion, we can construct an $\eta$-covering of $\mathcal G$ by the functions $(i\Delta)^{-1}h_{t_i}$ with $t_i = i/M$ and $i = -M^2, \ldots, +M^2$, where $M = M(\eta)$ grows polynomially in $\eta^{-1}$. This shows that the covering numbers corresponding to this $\eta$-net satisfy

$$\log(N(\mathcal G, L^2(P), \eta)) \lesssim \log(\eta^{-1}). \tag{4.11}$$

The square root of this entropy bound is integrable at zero as a function of $\eta$, which completes the proof by Dudley's continuity criterion (Theorem 2.6.1 in Dudley (1999)). □
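The integrability of the square root of a logarithmic entropy bound at zero can be illustrated numerically. The sketch below is an illustration only: it takes the entropy to be exactly $\log(\eta^{-1})$ (constants dropped), uses a plain midpoint rule, and the function name `entropy_integral` is hypothetical.

```python
import math


def entropy_integral(n_steps=100_000):
    """Midpoint rule for the integral of sqrt(log(1/eta)) over eta in (0, 1)."""
    step = 1.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        eta = (k + 0.5) * step          # midpoints stay strictly inside (0, 1)
        total += math.sqrt(math.log(1.0 / eta))
    return total * step


approx = entropy_integral()
# Substituting eta = exp(-s) turns the integral into Gamma(3/2) = sqrt(pi) / 2.
exact = math.sqrt(math.pi) / 2.0
```

The two values agree closely, so the entropy integral is finite, which is the condition required by Dudley's criterion for a polynomially growing covering.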
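4.2.2 Uniform CLT for the linear term

11 Theorem. Grant Assumption 1 and suppose that

$$(\nu_n^\varphi(t_1), \ldots, \nu_n^\varphi(t_k)) \xrightarrow{\mathcal L} (\mathbb G^\varphi(t_1), \ldots, \mathbb G^\varphi(t_k))$$

as $n \to \infty$ for every finite set $\{t_1, \ldots, t_k\} \subset (-\zeta,\zeta)^c$. If $h_n \gtrsim n^{-1/(4\alpha)}$ for some $\alpha > (1-\varepsilon)/2$, so in particular if $h_n \sim n^{-1/2}(\log n)^{-\rho}$ for some $\rho > 1$, then

$$\nu_n^\varphi \xrightarrow{\mathcal L} \mathbb G^\varphi \quad\text{in } \ell^\infty((-\zeta,\zeta)^c)$$

as $n \to \infty$.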

Proof. We set $\Delta = 1$ and suppose that the kernel is symmetric, i.e. $\mathcal F K_h(-u) = \mathcal F K_h(u)$, to ease notation. Given convergence of the finite-dimensional distributions it suffices to prove uniform tightness of $\{\nu_n^\varphi\}_{n\in\mathbb N}$ in $\ell^\infty((-\zeta,\zeta)^c)$, cf. van der Vaart and Wellner (1996), Chapter 1.5. We shall in what follows decompose $\nu_n^\varphi$ into a sum of several processes indexed by $t$, and prove tightness of each of these processes separately, which implies tightness of the sum of the processes by the asymptotic equicontinuity characterisation of tightness in $\ell^\infty((-\zeta,\zeta)^c)$ (e.g., Theorem 1.5.7 in van der Vaart and Wellner (1996)) and by the triangle inequality. We shall also frequently use the simple fact that tightness is preserved under isometric injections of $\ell^\infty((-\zeta,\zeta)^c)$: if $\nu$ is a process indexed by $s$ and $\nu'$ a process indexed by functions $f_s \in \mathfrak F$, and if $\nu(s) = \nu'(f_s)$ for every $s \in (-\zeta,\zeta)^c$, then tightness of $\nu'$ in $\ell^\infty(\mathfrak F)$ (normed by $\|H\|_{\mathfrak F} := \sup_{f\in\mathfrak F}|H(f)|$) implies tightness of $\nu$ in $\ell^\infty((-\zeta,\zeta)^c)$.

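We decompose $g_t = g_t^c + g_t^s$ as in the proof of Lemma 6, with the particular choice $\chi(x) = e^{x}\mathbf 1_{(-\infty,0]}(x)$ for $t < 0$, and symmetrically if $t > 0$. The integrand of $\nu_n^\varphi(t)$ in (4.7) equals

$$\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F[ixg_t^s]\mathcal F K_h](x) + \mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g_t^s\,\mathcal F[ixK_h]](x) + \mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F g_t^c\,\mathcal F K_h](x)\,ix + \mathcal F^{-1}[(\varphi^{-1})'(-u)\mathcal F g_t^c\,\mathcal F K_h](x)$$

$$=: (T_1 + T_2 + T_3 + T_4)(x).$$

The process indexed by the component $T_1$ is critical and its tightness is proved in Section 4.2.3 below.

Concerning $T_2$, we have $|\varphi^{-1}(-u)\mathcal F g_t^s\,\mathcal F[ixK_h]| \lesssim |\varphi^{-1}(-u)|(1+|u|)^{-2}$ by $\|xK_h\|_{L^1} + \|(xK_h)'\|_{L^1} \lesssim 1$, uniformly in $h$, and by the admissibility of $g_t$. By Assumption 1(c) we deduce that $T_2$ lies in a fixed norm ball of $H^1(\mathbb R)$. For $T_4$ we note $|(\varphi^{-1})'| \lesssim 1$, $\sup_{h>0,u}|\mathcal F K_h(u)| \le \|K\|_{L^1} < \infty$, $\sup_{|t|\ge\zeta}\|g_t^c\|_{H^1} < \infty$ by Lemmas 4 and 6, so $\{\mathcal F^{-1}[(\varphi^{-1})'(-u)\mathcal F g_t^c\,\mathcal F K_h](\cdot), |t| \ge \zeta\}$ is bounded in $H^1(\mathbb R)$. For $T_3$ we use $|\varphi^{-1}(u)| \lesssim (1+|u|)^{(1-\varepsilon)/2}$ and

$$\|(1+|u|)^{(1+\varepsilon)/2}\varphi^{-1}(-u)\mathcal F g_t^c\,\mathcal F K_h\|_{L^2} \lesssim \|(1+|u|)\mathcal F g_t^c\|_{L^2} = \|g_t^c\|_{H^1} < \infty,$$

uniformly in $|t| \ge \zeta$, again by Lemmas 4 and 6. We conclude that the norms $\|T_2 + T_4\|_{H^1}$ and $\|T_3/x\|_{H^{(1+\varepsilon)/2}}$ are bounded uniformly in $t \in (-\zeta,\zeta)^c$, $h > 0$. Each summand in $T_2 + T_3 + T_4$ is therefore contained in a fixed $P$-Donsker class: for $T_2 + T_4$ this follows from Proposition 1 in Nickl and Pötscher (2007) with $s = 1$, $p = q = 2$, and for $T_3$ we apply Corollary 5 for weighted Besov-Sobolev spaces in Nickl and Pötscher (2007) with parameter choice $s = (1+\varepsilon)/2$, $\beta = -1$, $p = q = 2$, $\gamma = \varepsilon/2$, noting that the moment condition there is satisfied by (2.5). The empirical process is thus indexed by functions $T_2 + T_3 + T_4$ that change with $n$ but that are contained in a fixed $P$-Donsker class, and so is tight by the asymptotic equicontinuity criterion. Together with the tightness of the critical term, derived below, this proves tightness of $\nu_n^\varphi$. □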

Combining the convergence of the finite-dimensional distributions from Corollary 8 with Theorem 11 and the uniform bounds on the remainder and bias term we have succeeded in proving Theorem 2.

4.2.3 The critical term

Note that in the ill-posed case $\lim_{|u|\to\infty}|\varphi(u)| = 0$, for instance when $\varphi(u) = (1-iu)^{-\alpha}$, the class involving $T_1$ with $\mathcal F K_h = 1$ is not $P$-Donsker even for $P$ with bounded density. The reason is, roughly speaking, that $\mathcal F^{-1}[\varphi^{-1}(-\cdot)] * (e^{\cdot-t}\mathbf 1_{(-\infty,t]})$ is then unbounded at $t$, and classes that contain functions unbounded at any point cannot be Donsker for such $P$, cf. the proof of Theorem 7 in Nickl (2006). This implies that one cannot use $h = 0$, i.e., $K_h = \delta_0$, in the proofs, as could have been done in the 'noncritical' terms $T_2, T_3, T_4$ above. Rather, one needs to exploit the fact that the kernel $K_h$ smooths out the singularities for $h$ fixed, and if $h_n$ does not approach zero too fast, there is still hope to obtain a uniform central limit theorem, as shown in a different but conceptually related situation of Theorems 9 and 10 in Giné and Nickl (2008).

As compactly supported kernels facilitate the arguments considerably, we introduce the truncated kernel

$$K_h^{(\zeta)} := K_h\mathbf 1_{[-\zeta/2,\zeta/2]}.$$

By the decay of $K$ and $K'$ from (2.4) we can again treat the term involving $K_h - K_h^{(\zeta)}$ by classical methods. Using $\|K_h - K_h^{(\zeta)}\|_{BV} \lesssim h^{\beta-2}$, where $\|\cdot\|_{BV}$ is the usual bounded variation norm, we obtain

$$\big|\varphi^{-1}(-u)\mathcal F[ixg_t^s]\mathcal F[K_h - K_h^{(\zeta)}](u)\big| \lesssim |\varphi^{-1}(-u)|(1+|u|)^{-2}h^{\beta-2},$$

whence $\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F[ixg_t^s]\mathcal F[K_h - K_h^{(\zeta)}]] \in H^1(\mathbb R)$ follows, with norms that even shrink in $h$ and are uniform in $t$. As for the terms $T_4$ above, we thus deduce the uniform tightness of this term since norm balls in $H^1(\mathbb R)$ are universally Donsker.

Recalling $g_t^s(x) = x^{-1}e^{x-t}\mathbf 1_{(-\infty,t]}(x)$, the term involving the truncated kernel can be written as

$$\mathcal F^{-1}[\varphi^{-1}(-u)\mathcal F[ixg_t^s]\mathcal F K_h^{(\zeta)}] = i\,q(\cdot - t) * K_h^{(\zeta)}$$

with

$$q(x) := \mathcal F^{-1}[\varphi^{-1}(-u)(1+iu)^{-1}](x). \tag{4.12}$$

The regularity of $q$ in the scale of Besov spaces $B^s_{p,r}(\mathbb R)$ is $s = (1+\varepsilon)/2$ for $p = 1$ and $r = \infty$: since $m(u) = \varphi^{-1}(-u)(1+iu)^{-1/2+\varepsilon/2}$ is a Fourier multiplier on $B^{(1+\varepsilon)/2}_{1,\infty}(\mathbb R)$ by Lemma 4(c), this assertion follows from the fact that

$$\mathcal F^{-1}[(1+iu)^{-1/2-\varepsilon/2}](x) = \Gamma(1/2+\varepsilon/2)^{-1}|x|^{\varepsilon/2-1/2}e^{x}\mathbf 1_{(-\infty,0]}(x)$$

(a Gamma-type density) is an element of that space. The latter follows either by checking directly that its $L^1$-modulus of smoothness satisfies $\omega(h)_1 \lesssim h^{(1+\varepsilon)/2}$ or by noting that multiplication by $(1+iu)^{(1-\varepsilon)/2}$ in the Fourier domain is an isomorphism between $B^{(1+\varepsilon)/2}_{1,\infty}(\mathbb R)$ and $B^{1}_{1,\infty}(\mathbb R)$ and $\mathcal F^{-1}[(1+iu)^{-1}](x) = e^{x}\mathbf 1_{(-\infty,0]}(x)$ is of bounded variation and thus contained in $B^{1}_{1,\infty}(\mathbb R)$. Moreover, by embedding theorems for Besov spaces, $q$ is then also an element of $B^{s}_{1,1}(\mathbb R)$ for any $s < (1+\varepsilon)/2$ and thus also of $L^1(\mathbb R) \cap L^2(\mathbb R)$. We refer to Triebel (2010) for these standard properties of Besov spaces.
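The Fourier transform of the Gamma-type density can be checked numerically under the convention $\mathcal F h(u) = \int e^{iux}h(x)\,dx$. The sketch below is illustrative only: the function name `gamma_type_ft`, the cutoff and the value of $\varepsilon$ are assumptions made for the example, and a substitution is used to remove the integrable singularity at the origin before applying a midpoint rule.

```python
import cmath
import math


def gamma_type_ft(u, a, n_steps=100_000, cutoff=40.0):
    """Approximate the Fourier transform at u of Gamma(a)^{-1} |x|^{a-1} e^x 1_{x<=0}.

    With x = -y and y = s**(1/a) the integral becomes
    (a * Gamma(a))^{-1} * int_0^S exp(-(1 + i*u) * s**(1/a)) ds,
    whose integrand is bounded near s = 0.
    """
    s_max = cutoff ** a                 # y = s**(1/a) then ranges over (0, cutoff)
    step = s_max / n_steps
    total = 0.0 + 0.0j
    for k in range(n_steps):
        s = (k + 0.5) * step
        total += cmath.exp(-(1.0 + 1j * u) * s ** (1.0 / a))
    return total * step / (a * math.gamma(a))


eps = 0.2                               # illustrative value of epsilon
a = 0.5 + eps / 2.0                     # shape parameter 1/2 + eps/2
u = 2.0
numeric = gamma_type_ft(u, a)
closed_form = (1.0 + 1j * u) ** (-a)    # principal branch of (1 + iu)^{-a}
```

The numerical value agrees with $(1+iu)^{-1/2-\varepsilon/2}$ up to discretisation error, consistent with the displayed inversion formula.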

We are thus left with proving tightness of

$$\sqrt n\int_{\mathbb R}\big(q(\cdot - t) * K_h^{(\zeta)}\big)(x)(P_n - P)(dx) = \sqrt n\int_{\mathbb R} q(y-t)\big(K_h^{(\zeta)} * (P_n - P)\big)(y)\,dy, \quad |t| \ge \zeta, \tag{4.13}$$

which is a smoothed empirical process indexed by

$$\mathfrak F = \{q(\cdot - t) : |t| \ge \zeta\}. \tag{4.14}$$

The following general purpose result follows from the proof of Theorem 3 in Giné and Nickl (2008), which builds on fundamental ideas in the classical paper Giné and Zinn (1984), and can be applied to the unbounded processes relevant here. For a given class of measurable functions $\mathfrak F$ we write

$$\mathfrak F'_\delta = \{f - g : f,g \in \mathfrak F, \|f-g\|_{2,P} \le \delta\}.$$

We shall rather loosely use the standard empirical process terminology from Giné and Nickl (2008).

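12 Theorem. Let $\mathfrak F$ be any $P$-pregaussian class of real-valued functions on $\mathbb R^d$ and let $\{\mu_n\}_{n=1}^\infty$ be a sequence of finite signed measures defined on $\mathbb R^d$ satisfying $\sup_n\|\mu_n\| < \infty$. Let $\bar\mu_n(A) = \mu_n(-A)$. Assume that $\mathfrak F \subset L^1(|\mu_n|)$ holds for every $n$ and, in addition,

(a) for each $n$, the class $\tilde{\mathfrak F}_n := \{f * \bar\mu_n : f \in \mathfrak F\}$ consists of functions whose absolute values are bounded by a constant $M_n$;

(b) $\sup_{f\in\mathfrak F'_\delta} E(f * \bar\mu_n(X))^2 \le 4\delta^2$ for every $\delta > 0$ and $n \ge n_0 = n_0(\delta)$ large enough;

(c) for i.i.d. Rademacher variables $(\varepsilon_i)_i$, independent of the $X_i$'s, we have

$$\Big\|\frac{1}{\sqrt n}\sum_{i=1}^n \varepsilon_i f(X_i)\Big\|_{(\tilde{\mathfrak F}_n)'_{1/n^{1/4}}} \to 0 \tag{4.15}$$

as $n \to \infty$ in outer probability;

(d) $\bigcup_{n\ge 1}\tilde{\mathfrak F}_n$ is in the $L^2(P)$-closure of $\sup_n\|\mu_n\|$-times the symmetric convex hull of some fixed $P$-pregaussian class of functions $\bar{\mathfrak F}$;

(e) for all $0 < \eta < 1$, the $L^2(P)$-metric entropy of $\tilde{\mathfrak F}_n$ satisfies $H(\tilde{\mathfrak F}_n, L^2(P), \eta) \le \lambda_n(\eta)/\eta^2$ for functions $\lambda_n(\eta)$ such that $\lambda_n(\eta) \to 0$ and $\lambda_n(\eta)/\eta^2 \to \infty$ as $\eta \to 0$, uniformly in $n$, and the bounds $M_n$ of part (a) satisfy

$$M_n \le \Big(5\sqrt{\lambda_n(1/n^{1/4})}\Big)^{-1} \tag{4.16}$$

for all $n$ large enough.

Then $\sqrt n(P_n - P) * \mu_n$ is uniformly tight in the Banach space $\ell^\infty(\mathfrak F)$ (equipped with the uniform norm $\|\cdot\|_{\mathfrak F}$).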

Proof. The differences to Theorem 3 in Giné and Nickl (2008) are as follows. We do not require $\mu_n(\mathbb R^d) = 1$ for all $n$, and (b) is slightly weakened, both permitted as we only establish tightness in this theorem and not convergence of the finite-dimensional distributions. Moreover, the new condition (d) replaces translation invariance of $\mathfrak F$ by a more generic condition. Note that Theorem 0.3 in Dudley (1973) implies that $L^2(P)$-closures of symmetric convex hulls of pregaussian classes are again pregaussian, which is all that is needed for the proof of Theorem 3 in Giné and Nickl (2008) to apply. □

We now verify these conditions for the classes above, with $d\mu_n(y) = K_{h_n}^{(\zeta)}(y)\,dy$. Let us first show that the class $\mathfrak F$ from (4.14) is indeed $P$-pregaussian. By Proposition 10 applied to

$$f(x) = e^{x}\big(e^{-t}\mathbf 1(x \le t) - e^{-s}\mathbf 1(x \le s)\big), \quad t,s \le -\zeta,$$

(and symmetrically for $t,s \ge \zeta$) and by the same estimates as in the proof of Theorem 9,

$$\|q(\cdot - t) - q(\cdot - s)\|_{2,P} \lesssim |t-s|^{\varepsilon(3+2\varepsilon)/(4+2\varepsilon)}. \tag{4.17}$$

Moreover, the tail bound for the third term in that proof applies exactly here such that the same arguments show that $\mathfrak F$ has polynomially growing covering numbers and is thus pregaussian. In particular, $\mathfrak F$ is bounded in $L^2(P)$. The functions $q(\cdot - t)$ are in $B^{(1+\varepsilon)/2}_{1,\infty}(\mathbb R) \subset L^1(\mathbb R) \cap L^2(\mathbb R)$ and thus in $L^1(|\mu_n|)$ since $K$ is bounded.

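(a) The envelopes of $q(\cdot - t) * K_h^{(\zeta)}$ are of order $M_n \lesssim h_n^{-\alpha'}$ for $\alpha' \in ((1-\varepsilon)/2, \alpha)$ when $h = h_n \gtrsim n^{-1/(4\alpha)}$ since the sup-norm is bounded by the BV-norm, which in turn is bounded in point (c) below.

(b) Let $g \in \mathfrak F'_\delta$, then $\|K_h^{(\zeta)} * g\|_{2,P} \le \|K_h^{(\zeta)} * g - g\|_{2,P} + \delta$ and the result follows from the triangle inequality if we show $\|K_h^{(\zeta)} * f - f\|_{2,P} \to 0$ uniformly over $f \in \mathfrak F$. From (4.8) above, noting $\mathrm{supp}(K_h^{(\zeta)} * (ixg_t^s)) \cap (-\zeta/2,\zeta/2) = \emptyset$, we conclude

$$\|K_h^{(\zeta)} * f - f\|_{2,P} \lesssim \|(1+|u|)^{-\varepsilon}(\mathcal F K_h^{(\zeta)} - 1)\|_{L^{2+4/\varepsilon}} + \|K_h^{(\zeta)} * f - f\|_{L^2}.$$

Since $\mathcal F K_h^{(\zeta)}(u)$ is uniformly bounded and tends to 1 pointwise and since $(1+|u|)^{-\varepsilon(2+4/\varepsilon)}$ is integrable, by dominated convergence the first norm tends to zero for $h \to 0$. Similarly, as $\mathcal F f \in L^2$ holds, $\|(\mathcal F K_h^{(\zeta)} - 1)\mathcal F f\|_{L^2} \to 0$ follows and by Plancherel's theorem also the second norm converges to zero. This convergence is uniform because of $|\mathcal F f(u)| = |\mathcal F q(u)|$ for all $f \in \mathfrak F$ and since $q \in L^2(\mathbb R)$.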

(c) The class $\{K_h^{(\zeta)} * q(\cdot - t) : |t| \ge \zeta\}$ consists of translates of the fixed function $K_h^{(\zeta)} * q$, which is a function of bounded variation with BV-norm of size $h^{-\alpha'}$ for some $\alpha' \in ((1-\varepsilon)/2, \alpha)$, using $q \in B^{1-\alpha'}_{1,1}(\mathbb R)$ from the argument after (4.12) and the estimate (61) in Giné and Nickl (2008) (whose proof applies also to the truncated kernels). The envelope $M_n$ of $\tilde{\mathfrak F}_n$ is then of the same size since the BV-norm bounds the supremum norm. Moreover the class $\{K_h^{(\zeta)} * q(\cdot - t) : |t| \ge \zeta\}$ has polynomial $L^2(Q)$-covering numbers, uniformly in all probability measures $Q$. To see this we argue as in Lemma 1 in Giné and Nickl (2009): note that a function of bounded variation is the composition of a 1-Lipschitz function with a monotone function. The set of all translates of a monotone function has VC-index 2, and hence has polynomial covering numbers by Theorem 5.1.15 in de la Peña and Giné (1999), with constants $A, v$ there independent of $n$. Composition with a 1-Lipschitz map preserves the entropy, and the estimate (22) in Giné and Nickl (2008) with envelopes $M_n \lesssim h_n^{-\alpha'}$ and $H_n(\eta) = H(\eta) \lesssim \log(\eta^{-1})$ now shows that

$$E\Big\|\frac{1}{\sqrt n}\sum_{i=1}^n \varepsilon_i f(X_i)\Big\|_{(\tilde{\mathfrak F}_n)'_{n^{-1/4}}} \lesssim \max\Big(\frac{\sqrt{\log n}}{n^{1/4}}, \frac{h_n^{-\alpha'}\log n}{n^{1/2}}\Big) \to 0$$

as $n \to \infty$, in view of $h_n^{-\alpha'} \le h_n^{-\alpha} \lesssim n^{1/4}$.

(d) Using that $K_h^{(\zeta)}$ is supported in $[-\zeta/2,\zeta/2]$, one shows by standard arguments that the class of functions

$$\Big\{\int q(x - t - y)K_h^{(\zeta)}(y)\,dy : |t| \ge \zeta\Big\}$$

is in the $L^2(P)$-closure of $\|K\|_{L^1}$-times the symmetric convex hull of the $P$-pregaussian class $\bar{\mathfrak F} = \{q(\cdot - t) : |t| \ge \zeta/2\}$. To see this one can either make a minor modification of the argument in Lemma 1 in Giné and Nickl (2008), or notice that, $\{q(\cdot - t) : |t| \ge \zeta/2\}$ being bounded in the separable Banach space $L^2(P)$ (cf. after (4.17)), the integrals $\int q(\cdot - t - y)K_h^{(\zeta)}(y)\,dy$ are $L^2(P)$-valued Bochner integrals, and can thus be obtained as $L^2(P)$-limits of simple functions lying in the symmetric convex hull of $\{z \mapsto \|K\|_{L^1}q(z - t) : |t| \ge \zeta/2\}$ (e.g., Appendix E and Theorem E.3 in Dudley (1999)).

(e) Write $f,g$ for distinct translates of $q$ (elements of $\mathfrak F$), and deduce from Minkowski's inequality for integrals that

$$\Big(E\big[(f * K_h^{(\zeta)}(X) - g * K_h^{(\zeta)}(X))^2\big]\Big)^{1/2} \le \int_{-\zeta/2}^{\zeta/2}|K_h(u)|\,\|f(\cdot - u) - g(\cdot - u)\|_{2,P}\,du \le \|K\|_{L^1}\sup_{|u|\le\zeta/2}\|f(\cdot - u) - g(\cdot - u)\|_{2,P}.$$

Since entropy bounds are preserved under Lipschitz transformations, and since

$$\{q(\cdot - u - t) : |t| \ge \zeta, |u| \le \zeta/2\} \subset \{q(\cdot - t) : |t| \ge \zeta/2\}$$

has polynomial $L^2(P)$-covering numbers by the same arguments as after (4.17), we deduce the bound $H(\tilde{\mathfrak F}_n, L^2(P), \eta) \lesssim \log(\eta^{-1})$ for every $\eta > 0$ small enough, independent of $n$. Conclude that we can take $\lambda_n(\eta) = \log(\eta^{-1})\eta^2$ so that the envelope condition (4.16) becomes

$$h_n^{-\alpha'} \lesssim (\log n)^{-1/2}n^{1/4}, \tag{4.18}$$

which is satisfied due to $\alpha' < \alpha$ and $h_n^{-\alpha} \lesssim n^{1/4}$, completing the proof.